SEASONAL ADJUSTMENTS BY ELECTRONIC
COMPUTER METHODS* 


JULIUS SHISKIN AND HARRY EISENPRESS
Bureau of the Census 


I. INTRODUCTION AND SUMMARY


During the past few years, electronic computer programs for seasonally
adjusting time series have been developed at the Bureau of the Census and
improved and extended at the National Bureau of Economic Research. The 
electronic computer programs have been made available to other organizations 
and seasonal adjustments are now being made in several parts of the country on 
several different machines. More than 3,000 series had been adjusted for 
seasonal variations on electronic computers by mid-1957 and these series are 
being released in seasonally adjusted form by the responsible statistical agencies.


The electronic computer programs described in this paper have a limited 
objective—to eliminate the heavy burdens and high costs previously required 
for seasonal adjustments of time series and, consequently, to make seasonally 
adjusted data available for all important series. This paper does not try to 
resolve the many complex conceptual problems implicit in the decomposition 





* Revision of paper presented at a joint meeting of the American Statistical Association and the Econometric
Society, session on Applications of Electronic Computers to Economic Statistics, December 27, 1955, in New York,
N.Y.

The revised paper has been approved for publication, as a report of the National Bureau of Economic Research,
by the Director of Research and the Board of Directors of the National Bureau in accordance with the resolution 
of the Board governing National Bureau reports (see Annual Report of the National Bureau of Economic Re- 
search). It is to be reprinted as No. 12 in the National Bureau's series of Technical Papers. 

Many persons and organizations have made important contributions to our work on the use of electronic com- 
puters for seasonal adjustments of time series. Almost all the different groups utilizing the Census Univac service 
offered suggestions, and some of the strong points of the present method are their contribution. The staffs of the 
Bureau of the Census, the National Bureau of Economic Research, the Board of Governors of the Federal Reserve 
System, the Department of Agriculture, and the Department of Trade and Commerce in Canada should be specifi- 
cally mentioned. 

Thanks are also due to Howard C. Grieves and Morris H. Hansen of the Bureau of the Census for encourage- 
ment and practical assistance in the first stages of the project; to Arthur L. Broida of the Federal Reserve System 
and Maxwell R. Conklin of the Bureau of the Census for valuable suggestions and criticisms of the early work; to 
Geoffrey H. Moore of the National Bureau of Economic Research for similar contributions more recently; to Max 
A. Bershad of the Bureau of the Census for painstakingly reviewing and improving several early drafts of this paper;
to a National Bureau of Economic Research staff committee consisting of Millard Hastay, Ruth P. Mack, and
Victor Zarnowitz, and to W. Allen Wallis of the University of Chicago for helpful criticisms of a later draft; to
Gladys F. Webbink for editorial suggestions; and to H. Irving Forman for drawing the charts. For assisting with 
the Univac programming, the writers are indebted to Lancelot W. Armstrong, George M. Heller, James L. McPher- 
son, and the late Edward I. Lober, all of the Bureau of the Census. 

During the 1956-57 academic year, while the writers were on leave of absence from the Bureau of the Census, 
working with the staff and records of the National Bureau of Economic Research, both Univac time and program- 
ming resources were provided by the Remington Rand Division of the Sperry-Rand Corporation. This project was 
supported by a grant from the National Science Foundation. 
of economic time series and more specifically in the adjustment of series for
seasonal variations, but only to show that the present electronic computer 
methods generally yield results of at least the same order of quality as the best 
clerical methods. There is little doubt, however, that the use of electronic 
computers, by forcing us to make explicit our assumptions at each stage of the 
work and enabling us to make comprehensive tests of the results, already has 
thrown considerable light on these problems and led to some improvements 
over the techniques previously used. 

This paper describes the two methods developed at the Bureau of the 
Census and compares the results. The first method is a mechanical version 
of the familiar and widely used ratio-to-moving-average method and the sec- 
ond a refinement of the first. In the newer method the trend-cycle curve is 
traced out by a weighted fifteen-month moving average which provides a 
flexible yet smooth graduation. Smooth curves are also fitted to the seasonal- 
irregular ratios to provide seasonal adjustment factors, and follow the ratios 
for the full period of the data. Extreme values among the ratios are isolated 
automatically by a built-in system of control charts and are replaced by aver- 
ages of the extreme ratio and surrounding ratios. Series as short as six years 
and as long as thirty years can be seasonally adjusted, and quarterly as well as 
monthly data can be handled. 

Comparisons for a large number of different types of series show the second 
method to be superior. Comparisons with adjustments carefully made clerically 
by three different statistical organizations indicate that the results are at least 
as good as manual adjustments of the same series. These comparisons indicate 
that this electronic computer program has brought us fairly close to providing 
on a mass basis a fully mechanical method of making seasonal adjustments as 
good as those previously prepared for only a small number of series by a com- 
bination of laborious hand computations and professional judgments. For the 
few series where this is not the case, the electronic computer program provides 
data which can be converted to satisfactory seasonal adjustments with only a 
small amount of additional hand manipulation. Some of the kinds of series
for which Method II is likely to yield inadequate adjustments are also described.

Continuing studies are being made to find ways of reducing the number of 
unsatisfactory adjustments, and the resulting refinements of the method will 
improve it still further. Nevertheless, professional review of the results, par- 
ticularly for the initial and terminal years of series, still is, and probably always 
will be, necessary. 


II. SEASONAL ADJUSTMENTS BY METHODS I AND II


The first seasonal method programmed for the Census Bureau work, Method 
I, is an adaptation and elaboration of the familiar ratio-to-moving-average 
method at its most advanced stage of development.¹ A series reflecting the


1 See, for example, F. C. Mills, Statistical Methods (New York, 1955), pp. 360-375; F. E. Croxton and D. J. 
Cowden, Applied General Statistics (New York, 1955), pp. 320-363; W. A. Wallis and H. V. Roberts, Statistics: A 
New Approach (Glencoe, Illinois, 1956), pp. 580-586; A. F. Burns and W. C. Mitchell, Measuring Business Cycles
(New York, 1946), pp. 43-55; H. C. Barton, Jr., “Adjustment for Seasonal Variation,” Federal Reserve Bulletin,
v. 27 (1941), pp. 518-528.
trend-cycle components is estimated by a twelve-month moving average of the
original observations. This estimate is then divided into the original observa- 
tions to obtain a series reflecting the seasonal and irregular components. For 
each month, a moving average curve is next fitted to the time series representing 
the seasonal-irregular component for that month in successive years in order 
to obtain estimates of the seasonal factors alone. This last step yields twelve 
sets of moving seasonal factors, one for each month. The seasonal factors for 
each year are then “centered” so that their sum equals 1,200. 
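The first pass described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Census program: the twelve-month average is taken here as a centered (2x12) average, a plain average across years stands in for the moving average fitted to each month's ratios, and the names `centered_ma12`, `center_factors`, and the synthetic series are hypothetical.

```python
import numpy as np

def centered_ma12(x):
    """Centered twelve-month moving average (assumed 2x12 form); NaN where undefined."""
    x = np.asarray(x, dtype=float)
    w = np.r_[0.5, np.ones(11), 0.5] / 12.0      # 13 weights summing to 1
    out = np.full(x.shape, np.nan)
    out[6:-6] = np.convolve(x, w, mode="valid")  # six months are lost at each end
    return out

def center_factors(factors):
    """Rescale each year's twelve factors (one row per year) so that they sum to 1,200."""
    factors = np.asarray(factors, dtype=float)
    return factors * (1200.0 / factors.sum(axis=1, keepdims=True))

# Synthetic monthly series: a linear trend times a fixed seasonal pattern (mean 1.0).
pattern = np.array([90, 95, 100, 105, 110, 105, 100, 95, 90, 100, 105, 105]) / 100.0
trend = np.linspace(100.0, 160.0, 72)            # six years of data
series = trend * np.tile(pattern, 6)

trend_cycle = centered_ma12(series)              # estimate of the trend-cycle component
si_ratios = 100.0 * series / trend_cycle         # seasonal-irregular ratios, in percent

# Assumption: a plain average over years replaces the moving average across years.
crude = np.nanmean(si_ratios.reshape(6, 12), axis=0)
centered = center_factors(crude[np.newaxis, :])  # twelve factors summing to 1,200
```

With this synthetic series the ratios recover the seasonal pattern closely, since the series contains no irregular component.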

An iterative procedure is used: first, the seasonal factors obtained as above 
are divided into the original observations to obtain a preliminary seasonally 
adjusted series, representing the trend-cycle-irregular components. This series 
is, in turn, smoothed by a five-month moving average to provide a trend-cycle 
curve that is more flexible than the twelve-month moving average; that is, a 
five-month moving average can change direction over a short interval, so that 
it follows fairly sharp peaks smoothly, as well as shallow ones. The sequence of 
computations first made on the twelve-month moving average is then repeated
on the five-month average to yield the final seasonally-adjusted series. 

Altogether, the method yields nineteen tables which show the successive 
stages of the computations from the original observations to the final seasonally
adjusted series.² Included are five different moving averages, two sets of
ratios to moving averages, two centered and two uncentered sets of moving
seasonal factors, two seasonally adjusted series, and five tests of the work.
Method I is described more fully below (Section III) in the course of the explanation
of the changes made for Method II.

The present writers studied the results of this method as applied to many 
series and also discussed it with other time series analysts who made similar 
studies. There is general agreement that this method is very good; that while 
it is sometimes possible to make a better adjustment for a single series or a 
few series, up to now it has not been possible to make adjustments of such high 
quality for large numbers of series. Nevertheless, a number of weaknesses have 
become evident. The possibilities of correcting these weaknesses depend partly 
upon the ingenuity of statisticians, but also upon the availability of a facility 
for carrying out masses of computations rapidly at low cost. The electronic 
computer comprises such a facility. The writers, therefore, carefully examined 
each one of these faults and proceeded to develop methods of overcoming them. 
These improvements have been incorporated in a revised seasonal method— 
Method II. 

Method II follows the general procedure of Method I but takes advantage 
of the great capacities of electronic computers for statistical computations, 
by utilizing more powerful and refined techniques and producing more informa- 
tion about each series. Thus, it substitutes weighted for simple moving averages 
and isolates and reduces the weight of extreme items more selectively. It com- 
putes measures of the relative significance of the trend-cycle, seasonal, and 
irregular components of each series and uses these relations automatically to 
guide the course of subsequent computations. It adds a new basis for judging 
the validity of the seasonal adjustment to those provided by Method I. It 


2 See Julius Shiskin, “Seasonal Computations on Univac,” American Statistician, February 1955, pp. 19-23. 
provides optionally a constant seasonal index for special uses. It computes
month-to-month percentage changes in the series and its components. It also 
produces point charts for the convenience of its users.³ The full array of data
now provided by this electronic computer program is shown for an illustrative 
series in Table 1. 

The principal features of Method II are: 

1. It computes a preliminary seasonally adjusted series which follows pri- 
marily the conventional ratio-to-moving-average technique. It starts with 
ratios computed by dividing the original observations by a twelve-month 
moving average; it computes moving seasonal adjustment factors from these 
ratios; and it obtains a seasonally adjusted series by dividing these preliminary 
seasonal adjustment factors into the original observations. 

2. It utilizes a complex graduation formula—a weighted fifteen-month 
moving average—as the estimate of the trend-cycle curve used to obtain the 
final seasonally adjusted series. For most series this formula yields a curve 
which is flexible, follows the data closely, and gives a smooth representation of 
the trend-cycle components. 

3. It utilizes a control chart procedure to identify extreme items among the 
seasonal-irregular ratios and systematically reduces the weight of these ex- 
tremes for the subsequent computations. For each month control limits of two 
standard errors are determined above and below a five-term moving average 
fitted to the seasonal-irregular ratios. Any ratio falling outside the limits is 
designated as “extreme” and is replaced by the average of the “extreme” ratio 
and ratios immediately preceding and following. 

4. It utilizes weighted moving averages of the seasonal-irregular ratios for 
each month to obtain the seasonal adjustment factors; for example, a three- 
term moving average of a three-term moving average, which is equivalent to a 
five-term moving average with the weights, 1, 2, 3, 2, 1. 

5. It utilizes a measure of the irregular component of each series to determine 
the type of moving average to fit to the seasonal-irregular ratios. The larger the 
irregular component, the larger the amount of smoothing that is carried out. 
Alternative graduation formulas, each appropriate for series with irregular com- 
ponents of different magnitude, are placed in the computer memory and auto- 
matically selected according to the average month-to-month amplitude of the 
irregular fluctuations. 

6. It takes into account changing trends in calculating seasonal adjustment 
factors for the first and last few years of each series. Instead of following the 
usual procedure of extrapolating the seasonal adjustment factor curve to the 
end of the series, this new method takes an average of the last two seasonal-
irregular ratios for a given month as the estimated value of each of the following
two or three ratios. These estimates are then used in computing the two seasonal 
factors that would otherwise be missing at the end of the series. A similar pro- 
cedure is used to obtain missing values for computing the ends of the trend- 
cycle curve. 

The electronic computer programs for Methods I and II provide for working 
or trading day corrections where they are needed. The working or trading day 





3 For a description of how these charts are prepared, see Harry Eisenpress, James L. McPherson, and Julius
Shiskin, “Charting on Automatic Data Processing Systems,” Computers and Automation, August 1955.
correction factors must, however, be available, for punching or taping, along
with the original observations; there is no technique built into the electronic 
computer program for estimating such factors. The working day correction is 
accomplished by the modification of the original observations, in the electronic 
computer routine, before they are started through the seasonal adjustment 
process. 

The faults in Method I and the methods for overcoming them which have 
been adopted in Method II are described below and comparisons of the seasonal 
adjustments made by Methods I and II are shown and analyzed for several 
economic series. A detailed description of each of the steps in these seasonal 
methods can be obtained by writing to the authors. 


III. FAULTS OF METHOD I AND THEIR IMPROVEMENT IN METHOD II
1. Improvements in the Trend-Cycle Curves 


(a) Smoothing the trend-cycle curves: The five-month moving average of the 
preliminary seasonally-adjusted series, which has been used in Method I as the 
underlying trend-cycle curve, occasionally yields a somewhat irregular curve, 
although for most series it produces better results than earlier methods based 
on a 12-month moving average of the original series. Nevertheless, for series 
with large irregular components, the 5-month moving average does not result 
in a smooth delineation of the trend-cycle components of the series. (See, for 
example, Chart 1.)

With the burden of computations no longer a factor, the writers were able to 
turn to the large array of complex graduation formulas previously developed 
by others to select a curve which is as flexible as, yet smoother than the five- 
month moving average. 

It seems fairly clear to students of this problem that there is no single gradu- 
ation formula which best delineates the underlying cyclical movements of all 
economic series.⁴ Perhaps it may be possible eventually to develop criteria for
selecting a particular graduation formula for each series according to the types 
of cyclical and irregular fluctuations characteristic of that series. Then with 
electronic computer programs for a large number of different graduation for- 
mulas available, the computer would calculate measures of the cyclical and 
irregular components in each series, and on the basis of these select the smooth- 
ing formula most suited to each particular series. The writers have tried to 
make such a start; however, its development is for the future. For the present, 
because of the time that will be required to develop a conceptual basis for this 
idea and to prepare the electronic computer programs, the writers have selected 
a single graduation formula to measure the trend-cycle factors. 

Graduation formulas are available which provide smooth and flexible curves 
and also eliminate seasonal fluctuations; for example, Macaulay’s 43-term 
formula. But such formulas involve the loss of a relatively large number of 
points at the beginnings and ends of series. Graduation formulas which provide 
similarly smooth and flexible curves and involve the loss of relatively few points 
do not also eliminate seasonal variations. The computation for a preliminary 
seasonally adjusted series is now easy mechanically; on the other hand, the 





4 See, for example, Arthur F. Burns and Wesley C. Mitchell, op. cit., Chapter 8, esp. p. 320.
Chart 1. Comparison of Spencer 15-month weighted moving average and
simple 5-month moving average. 
replacement of missing points is difficult conceptually. We, therefore, chose one
of the formulas which requires a preliminary seasonally adjusted series, but 
also minimizes the loss of points—the Spencer fifteen-month weighted moving 
average. 

The Spencer formula appears well suited for the purpose at hand: For most 
series it gives a smooth representation of the trend-cycle components, and fits 
the data as closely as a simple five-month moving average. The weights of the 
Spencer graduation are as follows: -3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21,
3, -5, -6, -3, each divided by 320. This weighting scheme is equivalent to taking a
five-month moving average of a five-month moving average of a four-month moving
average of a four-month moving average of the data, with weights of -3, 3, 4, 3, -3
(divided by 4) applied to either of the two five-month moving averages.⁵ This graduation formula also
has the property of fitting a third degree polynomial exactly. The marked im- 
provement in smoothing that can result from the use of the Spencer formula in 
place of the simple five-month moving average is illustrated in Chart 1. The 
greater the amplitude of the irregular movements in a series in proportion to 
its cyclical movements the more advantageous will be the use of the Spencer 
formula in place of the simpler moving average. This improvement in smooth- 
ing is reflected in the resulting seasonal-irregular ratios and in all the subse- 
quent computations. 
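The equivalence stated above, and the property of fitting a third degree polynomial exactly, can be checked numerically. The sketch below builds the fifteen weights by convolving the component averages; the function name `spencer_weights` and the test cubic are illustrative choices only.

```python
import numpy as np

def spencer_weights():
    """Spencer 15-term weights built from 4-term, 4-term, 5-term, and weighted 5-term averages."""
    four = np.ones(4)
    five = np.ones(5)
    weighted_five = np.array([-3.0, 3.0, 4.0, 3.0, -3.0])
    w = np.convolve(np.convolve(four, four), np.convolve(five, weighted_five))
    return w / 320.0                       # 4 * 4 * 5 * 4 = 320, so the weights sum to 1

w = spencer_weights()
numerators = np.rint(w * 320).astype(int)  # -3, -6, -5, 3, 21, 46, 67, 74, 67, ...

# A cubic evaluated at t = -7, ..., 7; the graduation at t = 0 should return its value there.
t = np.arange(-7, 8, dtype=float)
cubic = 2.0 + 0.5 * t - 0.3 * t**2 + 0.01 * t**3
smoothed_mid = float(np.dot(w, cubic))     # equals 2.0 up to rounding error
```

The cubic is reproduced because the weights are symmetric, sum to one, and have a zero second moment.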

Although the Spencer weighted fifteen-month moving average appears to 
yield a better estimate of the trend-cycle component (as we imagine it) than 
the five-month moving average, there is still the fundamental question of the 
suitability of either for this purpose. As we have said, different types of smooth 
curves will almost certainly be more appropriate for some series. We expect to 
investigate the subject of smoothing the preliminary seasonally adjusted series 
more intensively at a later stage (see Appendix A). 

(b) Extending the trend-cycle curves: The five-month moving average of the
preliminary seasonally adjusted series used in Method I also is defective in that 
it entails the loss of two observations at the beginning and at the end of each 
series. Since the last two months of the series are usually of considerable im- 
portance, Method I fills in these months by extrapolating the seasonal adjust- 
ment factors to cover the missing data. (The beginning of the series is similarly 
completed by symmetry.) This method works well in most series, but, as with 
the extrapolation in Method I of the five-term moving average (described in 
subsection 2, below), it is not optimum when there is a trend in the seasonal 
factors (i.e., a moving seasonal) at the end or beginning of the data. 

Method II attempts to improve upon this extrapolation procedure. Instead 
of extending the seasonal factors, we use an average of the last four months of 
the preliminary seasonally adjusted series as an estimate of the value of each of 
the seven months following the last month of this series. These estimates are 
then used in computing the seven missing values at the end of the Spencer 
graduation. The beginning of the Spencer graduation is supplied in similar 
manner. The Spencer graduations in Chart 1 have been extended to the ends 
of the series. The fit in these series, as in most of the series we have tested, 
appears quite good. 
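The extension rule just described may be sketched as follows. The function name `spencer_with_extension` is hypothetical, and the treatment of the beginning of the series (the average of the first four months, repeated seven times) is an assumption based on the statement that it is "supplied in similar manner."

```python
import numpy as np

SPENCER = np.array([-3, -6, -5, 3, 21, 46, 67, 74, 67, 46, 21, 3, -5, -6, -3]) / 320.0

def spencer_with_extension(adjusted):
    """Spencer graduation of a preliminary seasonally adjusted series, carried to both ends.

    Seven estimated months are appended, each equal to the average of the last
    four actual months; the beginning is padded in the same way (assumed).
    """
    x = np.asarray(adjusted, dtype=float)
    head = np.full(7, x[:4].mean())
    tail = np.full(7, x[-4:].mean())
    padded = np.concatenate([head, x, tail])
    return np.convolve(padded, SPENCER, mode="valid")   # same length as the input

# Illustrative series: a steady rise with a small alternating irregular component.
prelim = 100.0 + 0.5 * np.arange(36) + np.where(np.arange(36) % 2 == 0, 1.0, -1.0)
trend_cycle = spencer_with_extension(prelim)
```

Interior values are unaffected by the padding; only the first and last seven months depend on the estimated values.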


2. Improvements in Seasonal Adjustment Factor Curves 


Moving positional means of five terms are fitted to the seasonal-irregular 
ratios for each month in Method I: The largest and the smallest ratios in each 
set of five terms are dropped from each computation before the remaining three 
are averaged. These positional means have not always provided smooth curves, 
and occasionally are not even good fits, particularly at the beginnings and ends
of series. These defects arise partly from the method used for eliminating ex- 





5 For more information on the Spencer graduation, and on smoothing formulas, generally, see Frederick R. 
Macaulay's The Smoothing of Time Series (National Bureau of Economic Research, New York, 1931), esp. pp. 55, 
121-140, and M. G. Kendall, Advanced Theory of Statistics (London, 1946), Vol. II, Chapter 29. The fifteen-month
graduation formula used above was first described by J. Spencer in his article “On the Graduation of the Rates 
of Sickness and Mortality,” Journal of the Institute of Actuaries, Vol. 38 (1904), p. 334. 
Chart 2. Comparison of seasonal adjustment factors computed by Methods I and
II, sample months of sample series.

Legend:
• Ratios of original observations to 15-month weighted moving average
x Modified ratios of original observations to 15-month weighted moving average
Seasonal adjustment factors, Method II
Seasonal adjustment factors, Method I (computed from Method II ratios)
treme ratios (a method which sometimes eliminates ratios which are probably
not extreme, or retains ratios which had best be omitted, and thus distorts the
estimate of the seasonal factor) and partly from the limitations of a simple
five-term moving average of the seasonal-irregular ratios.
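The positional mean used in Method I is simple to state in code. The sketch below, with hypothetical function names and made-up ratios, drops the largest and smallest of each five consecutive ratios for a given month and averages the remaining three.

```python
def positional_mean5(window):
    """Average of the middle three of five values (largest and smallest dropped)."""
    if len(window) != 5:
        raise ValueError("a positional mean of five terms needs exactly five values")
    ordered = sorted(window)
    return sum(ordered[1:4]) / 3.0

def moving_positional_means(ratios):
    """Positional means for every year that has two neighbors on each side."""
    return [positional_mean5(ratios[i - 2:i + 3]) for i in range(2, len(ratios) - 2)]

# Seasonal-irregular ratios for one month in seven successive years (illustrative values).
ratios = [101.0, 99.0, 100.0, 120.0, 102.0, 98.0, 100.0]
means = moving_positional_means(ratios)   # first value: 101.0
```

Note that every window loses two ratios whether or not either is unusual, which is the weakness discussed in the text: here 98.0 and 99.0 are discarded alongside 120.0.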

(a) Isolating extreme ratios: To improve the identification of extreme ratios, 
a control chart procedure has been adopted in Method II. For each month, 
control limits of two “standard errors” are determined above and below the 
five-term moving average of the ratios. (The square of the standard error is 
here defined as the average of the squared deviations of the ratios from their 
corresponding five-term moving average values.) Any ratio falling outside the 
limits is designated as “extreme” and is replaced by the average of the “ex- 
treme” ratio and the ratios immediately preceding and following. If the ex- 
treme ratio is the first ratio for the month, it is replaced by the average of the 
first three ratios for the month; if it is the last ratio, it is replaced by the 
average of the last three ratios for the month. In effect, the weight accorded 
the extreme ratio in subsequent smoothing operations is reduced by two-thirds, 
while the weights of the adjacent ratios are each increased by one-third. This 
procedure is applied separately to the ratios of each month, from January to 
December. 
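A minimal sketch of this control chart procedure for the ratios of a single month follows. The function names and the sample ratios are hypothetical. To carry the five-term moving average to the ends, the sketch uses the rule given later in this section (the average of the two end ratios stands in for each of the two missing years).

```python
import numpy as np

def five_term_ma_extended(ratios):
    """Five-term moving average carried to both ends by padding with two-ratio averages."""
    r = np.asarray(ratios, dtype=float)
    head = np.full(2, r[:2].mean())
    tail = np.full(2, r[-2:].mean())
    return np.convolve(np.concatenate([head, r, tail]), np.ones(5) / 5.0, mode="valid")

def replace_extremes(ratios, limit=2.0):
    """Replace ratios lying more than `limit` standard errors from the moving average."""
    r = np.asarray(ratios, dtype=float)
    ma = five_term_ma_extended(r)
    sigma = np.sqrt(np.mean((r - ma) ** 2))      # "standard error" as defined in the text
    extreme = np.abs(r - ma) > limit * sigma
    out = r.copy()
    n = len(r)
    for i in np.flatnonzero(extreme):
        lo = min(max(i - 1, 0), n - 3)           # first or last three ratios at the ends
        out[i] = r[lo:lo + 3].mean()             # average of the extreme and its neighbors
    return out, extreme

# Ratios for one month over ten years; the fifth year is out of line.
ratios = [100.0, 101.0, 99.0, 100.0, 115.0, 100.0, 101.0, 99.0, 100.0, 100.0]
modified, flagged = replace_extremes(ratios)     # only 115.0 is flagged; it becomes 105.0
```

Replacement always draws on the original ratios, so two adjacent extremes do not feed on each other's modified values; whether the Census program behaves this way is not stated in the text and is an assumption here.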
The results of the new procedure as compared with the method of positional
means are illustrated in Chart 2. The effects of centering (in both methods) 
and smoothing (in Method II), which are discussed below, mask the differences 
due to the different treatment of extremes. Nevertheless, it is clear that small 
dips or crests in the lines of smoothed ratios due to the treatment of extremes 
in Method I have now been eliminated (see especially Chart 2, Business Fail- 
ures, April 1952 and September 1951). 

It should be borne in mind that the determination of “extremeness” for any 
ratio depends on the deviations of all the ratios in the series for that particular 
month from their moving average values. The standard error varies from 
month to month within series and between series. At present the data for all 
the years in the series for each month are used as one period for the purpose of 
calculating the standard error. Future experience may prove that two or more 
periods are preferable. Furthermore, our selection of two standard errors as the 
control limits is arbitrary. Tests of these limits now planned may lead to a 
change, probably to a smaller figure, say 1½ standard errors, so that more items
are identified as extremes (see Appendix A). This procedure would involve 
more smoothing of the seasonal-irregular ratios, which would in turn yield 
smoother seasonal-adjustment factor curves. 

A limitation of the new procedure may be mentioned here; since the five- 
term moving average, which serves as the base for the computation of the 
standard error, does not reach to the ends of the series, it must be extrapolated 
if any extremes in the first or last two years are to be identified and properly 
modified. Now, what weight shall be given to the ending (or beginning) years 
in this extrapolation? If the ratios for these years receive large weights, they will 
hardly ever be identified as extreme ratios; if the weights are small, a trend in 
the ratios may be confused with extreme items and the ratio curves may not 
be given their proper slope in the beginning and ending years. This problem is 
difficult to solve. In Method II the following procedure has been adopted: The 
average of the last two ratios for a given month is used as the estimated value 
of the ratio for each of the two years following the last year available; these 
estimated values are then used in calculating the moving average values for 
the last two years. The beginning years are treated similarly. 

(b) Smoothing the fitted curves: Even after adjusting extreme ratios properly, 
the five-term moving average of the ratios for each month sometimes is too 
erratic in its changes from year to year to fit our model of time series analysis, 
which assumes gradual seasonal change from year to year. The five-term moving 
average in Method I is therefore replaced in Method II by a three-term moving 
average of a three-term moving average. This is equivalent to a five-term 
moving average with the weights 1, 2, 3, 2, 1. This smoothing formula appears 
to be superior to the simple five-term moving average in eliminating erratic 
year-to-year changes in direction, while at the same time retaining the smooth 
short-term movements of the ratios. Furthermore, the ratios are smoothed 
after they are centered (i.e., adjusted so that their sum will be 1200.0 for each 
calendar year), rather than before centering, as in Method I, to avoid any 
distortions in the smoothed series due to centering. (It can easily be shown that 
distortions of the centered values will not occur in this case; that is, that 
smoothing based on linear formulas, of which the unweighted moving average
is the simplest example, will not change annual totals.) Thus, Method II now
produces seasonal adjustment factors that are centered and change only gradually
from year to year. Moreover, an important innovation has now been introduced:
The three-term moving average of the three-term moving average is replaced by the
three-term moving average of the five-term moving average whenever irregular movements
are pronounced.⁶ Thus, a more powerful smoothing process is used for series
having large irregular movements (see Appendix A).
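Both compound averages, and the claim that linear smoothing of centered ratios leaves annual totals unchanged, can be verified directly. In the sketch below the table of ratios is random illustrative data and `smooth_across_years` is a hypothetical helper; only interior years are computed.

```python
import numpy as np

w3x3 = np.convolve(np.ones(3), np.ones(3)) / 9.0    # weights 1, 2, 3, 2, 1 over 9
w3x5 = np.convolve(np.ones(3), np.ones(5)) / 15.0   # weights 1, 2, 3, 3, 3, 2, 1 over 15

def smooth_across_years(table, weights):
    """Smooth each month's column across years (rows); years without full windows are dropped."""
    table = np.asarray(table, dtype=float)
    cols = [np.convolve(table[:, m], weights, mode="valid") for m in range(table.shape[1])]
    return np.column_stack(cols)

# Eight years by twelve months of illustrative ratios, centered so each year sums to 1,200.
rng = np.random.default_rng(0)
raw = 100.0 + rng.normal(0.0, 5.0, size=(8, 12))
centered = raw * (1200.0 / raw.sum(axis=1, keepdims=True))

smoothed = smooth_across_years(centered, w3x3)      # four interior years remain
annual_totals = smoothed.sum(axis=1)                # each is still 1,200
```

The totals are preserved because every month is smoothed with the same weights, and those weights sum to one.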

The effects of the revised smoothing formulas for seasonal-irregular ratios 
used in Method II compared with those used in Method I are shown in Chart 
2. The fit of the smoothed lines to the ratios, with smoothing and centering 
accomplished in a mechanical manner, will, of course, differ from any smoothing 
done manually by the usual trial and error process. However, the differences 
in terms of the seasonally adjusted data will probably not be large or significant. 
In general, the fit of Method II is closer to the ratios and is smoother than that 
of Method I. 

(c) Extending the fitted curves: Method I does not take into account obvious 
changing trends and new seasonal factors in obtaining seasonal factors for the 
first and the last few years of each series. In Method I the first seasonal factor 
that can be computed for each month relates to the third year, but is also used 
for the first two years; and the last seasonal factor computed, which relates to 
the third year from the end of the series, is extrapolated to the last two years. 

This procedure—of bringing seasonal adjustment factors up to date by 
leveling off the curves so that their slopes are zero for the recent years—has 
been followed quite generally. It is, however, at variance with a basic assump- 
tion of our method, that the seasonal factors may vary gradually from year 
to year. Where the seasonal is truly constant—that is, where the slope of a 
seasonal adjustment factor curve is zero for several years—all the methods 
that we have considered for bringing the factors up to date give about the same 
results. For cases where the slopes may be significantly different from zero, 
level curves at the beginnings and ends will not measure the full seasonal 
factors; and consequently, the seasonally adjusted series will contain not only 
the trend, cycle, and irregular, but also some seasonal components. 

For this reason, a more sensitive extrapolation procedure has been intro- 
duced in Method II. The seasonal adjustment factor curve is not extrapolated 
directly to the end of the series; instead, the average of the last two seasonal-
irregular ratios for a given month is taken as the estimated value of each of
the following two ratios; and these estimates are used in computing the two 
seasonal factors that would otherwise be missing at the end of the series. (A 
similar procedure is used for the initial years.) The average of the last two avail- 
able ratios, rather than the value of the last ratio alone, is used as the estimate 
in order to avoid any distortion that might result from a highly irregular termi- 
nal ratio. 
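
The end-of-series procedure just described can be sketched in Python. This is a minimal illustration, not the Univac program: the function names and the sample ratios are hypothetical, and the curve is assumed to be the three-term moving average of a three-term moving average (weights 1, 2, 3, 2, 1) mentioned in Appendix A.

```python
def extend_ratios(ratios, n_extra=2):
    """Pad each end with the mean of the two nearest available ratios."""
    if len(ratios) < 2:
        raise ValueError("need at least two ratios")
    head = (ratios[0] + ratios[1]) / 2.0
    tail = (ratios[-1] + ratios[-2]) / 2.0
    return [head] * n_extra + list(ratios) + [tail] * n_extra


def seasonal_factors(ratios, weights=(1, 2, 3, 2, 1)):
    """Weighted moving average of the padded ratios, one factor per year."""
    half = len(weights) // 2
    padded = extend_ratios(ratios, half)
    total = float(sum(weights))
    return [
        sum(w * x for w, x in zip(weights, padded[i:i + len(weights)])) / total
        for i in range(len(ratios))
    ]


# Hypothetical seasonal-irregular ratios for one month, one value per year.
january = [96.0, 97.0, 98.0, 99.0, 100.0, 101.0]
factors = seasonal_factors(january)
```

With these steadily rising ratios the terminal factor comes out near 100.4, whereas holding the factor for the third year from the end (99.0 here) level would ignore the rise in the last two years.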

* To make this decision, measures of the average amplitude of the month-to-month movements in the trend-
cycle, seasonal, and irregular components of series have been developed and are used automatically in the com-
puter program. For a description of these measures, see Julius Shiskin, “New Measures of Economic Fluctuations,”
Improving the Quality of Statistical Surveys, Papers Contributed as a Memorial to Samuel Weiss, American Statistical
Association, Washington, D.C., 1956.

This procedure has the advantage of flexibility in the types of curves used
at the ends of series. On the one hand, where there are strong forces making 
for a constant seasonal pattern, the method will yield level curves at the ends 
of series. On the other hand, where there are strong forces making for a changing 
seasonal pattern, it will permit changes at the ends of series. The leveling off 
of the ratios for years following the last year of actual data will, however, 
exercise a constraint on the extent to which the slopes can change. While this 
procedure makes full use of the available data, it is neutral with respect to the 
question of future turns in seasonal behavior. It does not assume that trends 
will continue up or down or that they will reverse themselves but, instead, 
assumes only that the seasonal-irregular ratios continue at current levels. In
the cases where this assumption proves to be wrong, it will not give as bad 
results as would follow from one of the alternative assumptions. 

The difference in our methods of fitting curves to the first and last years of 
the seasonal-irregular ratios may be clarified in the following algebraic terms. 

If $X_n$ is the last ratio available, then it is implicit in Method I that
$X_{n+1} = X_{n-4}$ and $X_{n+2} = X_{n-3}$, while in Method II we explicitly make
$X_{n+1} = X_{n+2} = \tfrac{1}{2}(X_n + X_{n-1})$.

It seems reasonable to assume that better estimates of the missing ratios 
will usually be provided by ratios for more current than for less current years. 
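
As a worked illustration of the Method II rule, suppose the curve fitted to the ratios is the three-term moving average of a three-term moving average (weights 1, 2, 3, 2, 1, as described in Appendix A). The symbols $F_n$ and $F_{n-1}$ for the seasonal factors of the last two years are introduced here for illustration only. Substituting the estimated ratios gives the weights that the available ratios implicitly receive:

```latex
\begin{align*}
F_n     &= \tfrac{1}{9}\bigl(X_{n-2} + 2X_{n-1} + 3X_n + 2X_{n+1} + X_{n+2}\bigr)
         = \tfrac{1}{9}\bigl(X_{n-2} + 3.5\,X_{n-1} + 4.5\,X_n\bigr), \\
F_{n-1} &= \tfrac{1}{9}\bigl(X_{n-3} + 2X_{n-2} + 3X_{n-1} + 2X_n + X_{n+1}\bigr)
         = \tfrac{1}{9}\bigl(X_{n-3} + 2X_{n-2} + 3.5\,X_{n-1} + 2.5\,X_n\bigr),
\end{align*}
\qquad\text{with } X_{n+1} = X_{n+2} = \tfrac{1}{2}(X_n + X_{n-1}).
```

Rescaled so that they sum to 10, the weights are approximately 1.1, 3.9, 5.0 for the last year and 1.1, 2.2, 3.9, 2.8 for the year before it, so the most recent ratios dominate the terminal factors.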

Inspection of this approach for our test series indicates that it generally 
gives reasonable results. The results of employing these different methods 
routinely to obtain seasonal adjustment factors for the beginnings and ends of 
series are illustrated in Chart 2. It is clear from the chart that a trend in the 
ratios will now be reflected at the ends of the series and that the resultant 
curves for the terminals of series will be similar to those for the middles. 

It is important to note, however, that this method of adjusting the ends is 
not always satisfactory. Unsatisfactory adjustments will appear more fre- 
quently in series with large irregular components, when the last two ratios are 
both relatively extreme, and particularly when they fall on the same side of 
the seasonal adjustment factor curve. 

The changes in the treatment of the initial and terminal years in Method II, 
as compared to Method I, appear to account for most of the differences that 
have been observed in series adjusted by both methods. Future experience with 
Method II is expected to lead to modifications of this procedure by introducing 
more complex extrapolation methods. 

The technique of using extrapolated average values at the ends of series to 
extend moving averages to cover the full period of the data is employed three 
times in Method II: (1) to extend the weighted Spencer 15-month moving 
average fitted to the preliminary seasonally-adjusted series (Section III, 1, b);
(2) to extend the five-term moving average used as a basis for calculating 
control limits needed to isolate extreme ratios (Section III, 2, a); and (3) to 
extend the seasonal adjustment factor curve fitted to the seasonal-irregular 
ratios. A good deal obviously depends upon this technique. It seems reasonably 
safe and is certainly preferable to the alternative assumption that the cyclical 
or seasonal curves level off at the beginnings and ends of series. We recognize, 
however, that we are dealing here with the basic problem of economic fore- 
casting, and that this technique may sometimes lead us astray. 

3. Extending the Electronic Computer Program to Cover 30-Year Monthly Series


Method I is limited to monthly series of a maximum duration of fifteen years. 
For most of our users, concerned primarily with postwar data, this has been 
satisfactory; but for groups concerned with longer series, we were only able to 
make this service available in a rather clumsy way by splitting the data into 
segments with very long overlaps. 

The memory capacity of the electronic computing machines for which the 
Method II program has been prepared does not permit an indefinite expansion
of the period that can be used. A substantial increase in the number of years to 
be covered would require the use of relatively inefficient techniques and would 
slow down operations. Fortunately, a simple expedient permitted the doubling 
of the maximum number of years included. (Instead of using one computer 
memory position for each monthly figure as in the earlier method, Method II 
puts two months’ data into each position. While this limits the maximum 
number of digits for each month to six, it is, for most economic series, a satis- 
factory upper limit.) Thus, the new method can now be routinely applied to 
any time series from six to thirty years long. For longer series division into 
several overlapping segments is necessary for the present. 
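
The packing expedient can be illustrated with a short Python sketch. The actual layout of a Univac memory position is not described here, so the decimal arrangement below (first month in the high-order six digits, second month in the low-order six) and the function names are assumptions made only for illustration.

```python
DIGITS_PER_MONTH = 6          # upper limit on digits per monthly figure
BASE = 10 ** DIGITS_PER_MONTH


def pack(first_month, second_month):
    """Store two non-negative monthly figures in one twelve-digit number."""
    for value in (first_month, second_month):
        if not 0 <= value < BASE:
            raise ValueError("each figure must fit in six digits")
    return first_month * BASE + second_month


def unpack(word):
    """Recover the two monthly figures from a packed number."""
    return divmod(word, BASE)


# Thirty years of monthly data then need 180 positions instead of 360.
series = list(range(1000, 1360))
words = [pack(series[i], series[i + 1]) for i in range(0, len(series), 2)]
```

The six-digit ceiling in the text follows directly from this kind of layout: anything larger would spill into the neighbouring figure.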


4. Additional Tests


In the analysis of current economic conditions, a great deal of interest at- 
taches to monthly changes. For this reason a reasonable argument can be made 
that month-to-month changes rather than monthly levels should be adjusted for 
seasonality. Indeed, the well-known link relative method developed by Warren 
M. Persons follows this idea.7 The link relative method, however, lacks the
flexibility or the simplicity of the ratio-to-moving-average method for com- 
puting moving seasonal adjustment factors. 

To determine whether Method II makes a good seasonal adjustment of 
month-to-month changes as well as monthly levels, link relatives of seasonal- 
irregular ratios were compared with the link relatives of the seasonal adjust- 
ment factors implicitly fitted to these link relatives by Method II. The results 
indicate that the implicit curves fitted to the link relatives of the seasonal- 
irregular ratios are similar in smoothness, closeness of fit and general sweep to 
the curves fitted to the ratios to moving average. Consequently, Method II 
seems to yield a seasonal adjustment of the month-to-month changes of about 
the same quality as the seasonal adjustment of the absolute observations. 
Chart 3 illustrates this point. 
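
For readers unfamiliar with the term, a link relative expresses each month as a percentage of the preceding month. The sketch below, with hypothetical figures and a hypothetical function name, shows the quantity that would be computed for both the seasonal-irregular ratios and the fitted seasonal factors before the two sets are compared.

```python
def link_relatives(values):
    """Each value as a percentage of the value immediately before it."""
    return [100.0 * cur / prev for prev, cur in zip(values, values[1:])]


# Hypothetical seasonal-irregular ratios and fitted factors, Jan. to Apr.
si_ratios = [95.0, 98.0, 104.0, 101.0]
seasonal = [96.0, 98.0, 103.0, 102.0]

si_links = link_relatives(si_ratios)
seasonal_links = link_relatives(seasonal)
```

A list of n monthly values yields n - 1 link relatives; the comparison in the text is between the two resulting sequences.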

What is the effect of our method of seasonal adjustment upon series that 
have no seasonal component—does our method introduce spurious fluctuations 
in series? To answer this question partially Method II was applied to stock 
prices, which are not considered to have any seasonal fluctuations, and to un- 
employment after adjustment for seasonal variations by Method II. As can be 
seen from Chart 4, the effect of a Method II adjustment upon such series is 
trivial. 





7 See Warren M. Persons, “Indices of Business Conditions,” Review of Economic Statistics, January 1919.

Chart 4. Effect of seasonal adjustment by Method II on series without seasonal components.

5. Conclusions Regarding Method II


It is difficult to measure objectively the quality of a seasonal adjustment. 
There is widespread agreement, however, that a good adjustment is one that 
minimizes repetitive intra-year movements. While moving average curves 
satisfy this criterion, such curves have in the past had limited use for business-
cycle analysis because they distort or bias the dates of turning points, the
amplitudes, and the patterns of business cycles, and because there is no satis- 
factory way of bringing them up to date. While it is conceivable that a moving 
average curve that overcomes these limitations can eventually be developed, 
for the present, conventional seasonally adjusted series appear preferable. 

Inspection of the results yielded by Methods I and II for a sample of series 
indicates that in terms of this criterion, i.e., the minimization of repetitive 
intra-year movements, Method II is the better. The techniques for estimating
the trend-cycle component, for isolating extreme items, and for smoothing the 
seasonal-irregular ratios for each month are certainly better than the corre- 
sponding techniques used in Method I. The technique for extending the dif- 
ferent moving average curves to the beginnings and ends of series also seems 
better. Comparisons of the net results of all these factors are made in Chart 5, 
which shows the original observations and the data seasonally adjusted by 
Methods I and II for some of our test series. The theoretical advantages of 
Method II have little impact on these series, except at the beginnings and ends.
However, where the differences do occur, the advantages appear to be in favor 
of the newer method. 


Chart 5. Comparison of seasonal adjustments by Methods I and II.

A comparison has also been made of seasonal adjustments prepared manually
at the National Bureau of Economic Research, the Office of Business Economics 
of the Department of Commerce, and the Department of Agriculture, and the
Method II adjustments for the same series. The NBER adjustments, shown 
in Chart 6, employ stable seasonal factors, with two short periods selected for 
each series; the OBE and Department of Agriculture employ moving adjust- 
ments for the series selected. The differences in the results are small. Where 
differences do appear, Method II usually yields the smoother seasonally ad- 
justed series. It seems plain from these comparisons that Method II can be 
counted upon to yield an adjustment of the same order of quality as the best 
manual methods. Furthermore, this method appears to be of such generality 
that it can make stable and moving adjustments about equally well. 

Chart 6. Comparison of manual and electronic computer seasonal adjustments. Each panel shows the original observations, the manually adjusted series, and the Method II adjusted series for 1947-1956 on ratio scales. Panels: Business failures, liabilities (millions of dollars; stable factors, NBER); Broiler ch[...] (thousands; moving factors, Dept. of Agriculture); Manufacturers’ sales, total (billions of dollars; moving factors, OBE).

434 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1957 


A professional review of each Method II adjustment is, however, still neces- 
sary. As in the case of all methods of seasonal adjustment, this method im- 
plicitly makes certain assumptions regarding the nature of the forces affecting 
each series. These assumptions are probably applicable to most series, but not 
to all. For example, it assumes that the relations between seasonal and cyclical 
forces are multiplicative rather than additive. For the comparatively few series 
for which these relations are not primarily multiplicative, poor seasonal adjust- 
ments may result. In the light of current figures that became available after 
some of the adjustments were made, it is also clear that the adjustments at 
the ends of series are sometimes unsatisfactory. There may be other deficiencies 
of which we are not yet aware. Constant vigilance is therefore required. 

That Method II does not always yield good adjustments can be seen from 
the series shown in Chart 7. The Method II adjustment for cotton stocks does 
not smooth out the annual patterns fully, leaving positive or inverted patterns 
of the same shape but smaller amplitude than that of the seasonal factors. As 
can be seen from the chart, a much more satisfactory adjustment was obtained 
by using a stable seasonal index with an amplitude correction. This illustration 
suggests difficulties where the monthly figures for the year (calendar or fiscal)
are tied together by a single common event (e.g., in agricultural crop series). 

Another type of series for which Method II will not produce a uniformly 
good adjustment is one in which there is an abrupt change in the seasonal 
pattern. The technique adopted for fitting moving averages to seasonal- 
irregular ratios will always yield smooth seasonal factor curves, in accordance 
with our assumption of slow, gradual changes in the seasonal factors from year 
to year. Sudden year-to-year shifts can, however, occur for various reasons, 
for example, as a result of administrative decisions by business associations or 
government agencies. Thus abrupt seasonal changes no doubt occurred in 
some parts of the economy when the automobile industry changed the dates 
for introducing new models from the spring to the fall, and when the govern- 
ment deferred the date for submitting income tax returns from March 15 to 
April 15. 

It is also clear from our studies that the isolation of the seasonal factor is 
suspect in the case of series with very large irregular factors. For this reason 
the Univac program routinely adds constant seasonal adjustment factors and 
corresponding seasonally adjusted series when the average month-to-month 
amplitude of the irregular factor is four per cent or more. 
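
The four per cent rule can be sketched as follows. The amplitude measure is assumed here to be the mean absolute month-to-month percentage change of the irregular component; the exact measure used in the program is described in the reference cited earlier, not in this section, and the function names are hypothetical.

```python
def average_amplitude(component):
    """Mean absolute month-to-month percentage change of a component."""
    changes = [
        abs(cur / prev - 1.0) * 100.0
        for prev, cur in zip(component, component[1:])
    ]
    return sum(changes) / len(changes)


def needs_constant_factors(irregular, threshold=4.0):
    """True when constant seasonal factors should also be supplied."""
    return average_amplitude(irregular) >= threshold


# Hypothetical irregular components, expressed as index numbers.
noisy = [100.0, 105.0, 100.0, 105.0]
quiet = [100.0, 101.0, 100.0, 101.0]
```

Here `noisy` averages about 4.9 per cent and would trigger the additional constant-factor tables, while `quiet` averages about 1 per cent and would not.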

Experience gained with the results of Method II has led to a program of 
testing some alternative procedures with a view to introducing further improve- 
ments. Thus the present method of obtaining seasonal-irregular ratios at the 
ends of series does not give good results when the last two ratios, whose average 
is used as the estimate for the years following the last one for which a figure is 
available, are both relatively extreme, and particularly when they fall on the 
same side of the seasonal adjustment factor curve. Experiments are being made 
with various alternatives, including averaging more ratios when the irregular 
component is large. A moving average curve, of a period that varies with the 
magnitude of the irregular fluctuations of the series, is planned instead of the 
fifteen-month weighted moving average alone. At present the program provides 
no precise test of the existence of seasonality in a series, though some computa-
tions are made to guide the user in making such a judgment. A test which
involves correlating the irregular and seasonal components, year by year, may
be feasible, and statements could be printed with the computations explaining
whether a seasonal adjustment is necessary and whether the results are satis-
factory according to this test.*

Chart 7. Sample of unsatisfactory Method II seasonal adjustment: Total cotton stocks.





* These possible revisions are described more fully in Appendix A.

This brief description of changes contemplated is intended to underline the
fact that while we consider the results of Method II satisfactory for most pur- 
poses, we do not by any means consider them the best attainable within this 
framework. Improvements will continue to be introduced as the need for them 
becomes clear and techniques for making them are developed. 

The direction of these changes will be toward including within the general 
approach a large variety of alternative techniques. Measures of the relations 
among the systematic economic forces characteristic of each series and of the 
relations between these forces and chance forces are now computed. In addition 
the electronic computer program will provide for a larger array of smoothing 
and curve fitting formulas. The appropriate technique for each series will then 
be selected automatically among the alternatives on the basis of the measures 
of the characteristics of each series. There are prospects that different tech- 
niques can even be used automatically for different time periods of the same 
series. As we stated earlier, the present program contains a start toward this 
goal, in that there is no fixed formula for computing the seasonal adjustment 
factors for all series, and that one of three formulas is now selected according 
to the magnitude of the average absolute amplitude of the irregular component 
of the series. 

The Census seasonal electronic computer program appears, however, already 
to have brought us fairly close to a mechanical method of providing on a mass 
basis seasonal adjustments of the quality previously obtained for a small
number of series by a combination of laborious hand methods and professional
judgments.*

The computations of Method II take about two and one-half times as long 
on Univac as those of Method I—2.3 minutes for a ten-year monthly series as 
compared to one minute. While the relative increase in cost for Method II as 
compared to Method I may appear large, the cost of doing the calculations 
involved in either Method I or II on an electronic computer is small compared 
to the cost of simpler methods by conventional means, and a great many series 
can be adjusted rapidly. The necessary computing and printing for 3,000 ten- 
year series could be completed on a Univac system in one week. A large 
volume of data can thus be made ready for further analysis on short notice and 
large-scale seasonal computations that become necessary because of revisions 
in original data can be completed quickly. 
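
The one-week figure can be checked roughly from the timings quoted, assuming (our assumption) that the per-series times cover the computing alone and that the machine runs continuously:

```latex
\begin{align*}
\text{Method II:}\quad 3{,}000 \times 2.3\ \text{min} &= 6{,}900\ \text{min} = 115\ \text{hours} \approx 4.8\ \text{days},\\
\text{Method I:}\quad 3{,}000 \times 1\ \text{min} &= 3{,}000\ \text{min} = 50\ \text{hours} \approx 2.1\ \text{days}.
\end{align*}
```

On these assumptions the Method II computations alone occupy a little under five days, leaving the remainder of the week for printing.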


* Several other methods of seasonal adjustment already have been or are being programmed for electronic computers. So far, however, they have been applied only on a small scale and, therefore, cannot be appraised. Appendix B gives a summary description of them.


IV. FINAL REMARKS


(1) The present electronic computer program has been prepared for monthly
series only. However, experiments conducted at the National Bureau of Eco-
nomic Research and the Dominion Bureau of Statistics of Canada indicate
that it can also be applied to quarterly data. Good results can be obtained by
the following procedure: convert the quarterly series to a monthly one by
interpolating monthly values in the series, apply the computer program to the
converted series, then convert the monthly adjusted series back to quarterly
form. The interpolation can be accomplished very easily by repeating the quar-
terly figure for each of the months of the quarter. This method, the so-called
“step method” of interpolation, gives almost the same results as a direct
quarterly adjustment of the data.
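
The step method is simple enough to sketch in a few lines of Python. The function names are hypothetical, and the conversion back to quarterly form is assumed here to be a plain average of the three adjusted months, a detail the text does not specify.

```python
def quarterly_to_monthly(quarterly):
    """Step interpolation: repeat each quarterly figure for its three months."""
    return [q for q in quarterly for _ in range(3)]


def monthly_to_quarterly(monthly):
    """Average each block of three months (assumed conversion back)."""
    if len(monthly) % 3 != 0:
        raise ValueError("expected a whole number of quarters")
    return [sum(monthly[i:i + 3]) / 3.0 for i in range(0, len(monthly), 3)]


quarterly = [300.0, 330.0, 360.0, 345.0]
monthly = quarterly_to_monthly(quarterly)
# ... the monthly seasonal adjustment program would be applied here ...
restored = monthly_to_quarterly(monthly)
```

Without an intervening adjustment the round trip returns the original quarterly figures, which is a useful check that the conversion itself introduces no distortion.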

(2) There are certain desirable adjustments that appear to be extremely 
difficult to make mechanically through the electronic computer program. An 
example is the problem of taking care of gaps in series, unrepresentative periods, 
and highly extreme individual items, such as arise from strikes. These extreme 
items may significantly affect both the trend-cycle curves fitted to the original 
and to the preliminary seasonally adjusted data and also the curves fitted to 
the seasonal-irregular ratios. Even our method of mitigating the influence of 
irregular items in computing the seasonal adjustment factors is sometimes not 
adequate to take care of such items. One method of handling these problems
would be to adjust the original observations before putting them into the com- 
puter. Another example is provided by a series in which an abrupt change in 
the seasonal pattern takes place. Such a series might best be handled by sepa- 
rating it into two parts, at the point of the change in seasonality, and processing 
each part separately through the electronic computer. Such manual adjust- 
ments of the original observations would probably give better results than any 
mechanical method that we could devise. 

(3) The writers have encountered, in their discussions with economists, some 
suspicion of the use of computers for economic analysis. There is a lurking fear 
that this highly fascinating new tool may divert us from analysis of real eco- 
nomic problems into the development of more elaborate, more refined, more 
intricate computations. This fear is probably well warranted. The temptation 
to put aside the substantive analysis in favor of the development of new 
methodology must be resisted. 

(4) Others have felt that the application of such complex techniques as are 
involved in Method II to data of the crudeness that is characteristic of some 
economic series results in specious refinements. There is, however, another and, 
we think, better way of looking at this problem. Economic analysis is a search 
for uniformities in economic behavior. The analysis of large amounts of data 
by powerful techniques is more likely to uncover uniformities than the analysis 
of a few series with crude tools.

(5) As a result of the seasonal work done during the past few years, there is 
now available at the Census Bureau and the National Bureau a depository of 
punched cards containing several thousand economic series. Measures of trend-
cycle, seasonal, and irregular components of these series, and other new meas-
ures that have recently been added to our electronic computer program,10
could be calculated in a few days. The titles of these series have been punched 
on cards along with several identification codes, such as economic process and 
industry. Various statistical measures and additional codes could easily be 
added—for example, measures of business-cycle conformity and timing and the 
average long-run rate of growth. Through the punched cards, or electronic 
computer tapes based upon them, these data could be organized in many dif-
ferent ways. Such punched cards would provide the raw material for the de-
velopment and testing of new theories of economic fluctuations. They consti-
tute an unparalleled challenge to the ingenuity and imagination of economic
statisticians.

10 See Julius Shiskin, “Electronic Computers and Business Indicators,” Journal of Business, October, 1957;
reprinted as Occasional Paper 57, National Bureau of Economic Research.

(6) Modern data-processing systems record, store, calculate, compare, 
choose, and print numbers, letters, and other symbols. They perform these 
operations automatically, accurately, and at lightning speeds, but with abject 
devotion to very detailed instructions provided by human beings. While there 
is no doubt that this equipment will eventually be used to proliferate other more 
elaborate measures of economic activities, the mechanical production of such 
new measures is not enough to assure an improvement in our understanding 
of economic events. The fruitfulness of this work will ultimately depend, as do 
all other empirical studies, upon the quality of the theoretical concepts formu- 
lated by economic scientists to organize and analyze the data. 


APPENDIX A 


REVISIONS OF SEASONAL METHOD II NOW UNDER CONSIDERATION


Since the completion of the Univac program considerable experience has been gained
with the results of Method II. On the basis of this experience, we are making tests with a
view to revising the electronic computer program. A brief description of each of the con-
templated tests is given below. The series to be used in testing have been selected with
the following criteria in mind: (1) differing irregular, cyclical, and seasonal components 
so that the results for series with different types of economic fluctuations will be known; 
and (2) widely used series, so that the substantive meaning of the results can better be 
understood. The five series selected are: total unemployment; railroad freight ton miles; 
residential construction contracts; business failures, liabilities; and Federal Reserve index 
of mining production. 

(a) Variable method of adjusting ends of series: The present method of obtaining seasonal- 
irregular ratios at the ends of series will not give good results when the last two ratios, 
whose average is used as the estimate for the years following the last one for which a figure 
is available, are both relatively extreme, and particularly when they fall on the same side 
of the seasonal adjustment factor curve (see, for example, Chart 2, Business Failures, 
December). Experiments are under way to determine an effective way of handling such 
situations. 

These experiments will involve adjusting the test series for periods which both include 
and exclude data for terminal years; for example, a series for which data for the period 
1940-1956 are available will be adjusted for the period 1940-1950 and 1946-1956. The 
effect of the method of adjusting ends can thus be determined by comparing the adjust- 
ments for the years 1946-1950 when data for 1940-1945 and 1951-1956 are and are not 
used. 

Several different methods of estimating seasonal-irregular ratios for the years for which 
they are needed to bring the seasonal adjustment factor curves to the end years will be 
tested. For illustrative purposes these alternative methods along with the implicit weights 
given in each case to the seasonal-irregular ratios, when a three-term of a three-term 
moving average is fitted to them, are shown in Table A-1. Our present thought is that a 
variable method will prove the best; for example, to average no more than two ratios, as 
at present, when the irregular component is small, and four ratios when it is large. 

(b) Control limits: The selection of two standard errors as the limits for separating nor- 
mal from extreme ratios was arbitrary, in the sense that it was not based on any study 
of the distribution of seasonal-irregular ratios. Now evidence is mounting that these limits 
are too broad—too many extreme ratios appear to be included without modification in 
the averaging for the seasonal adjustment factors. We are planning studies of the dis- 
tribution of seasonal-irregular ratios and tests to determine the comparative results with 
limits of 1 and 1½ standard errors.
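
The effect of narrowing the control limits can be illustrated with a small sketch. It assumes, for illustration only, that the standard error is the root-mean-square deviation of the ratios from the fitted curve; the function name and the sample figures are hypothetical.

```python
def flag_extremes(ratios, curve, k=2.0):
    """Mark ratios lying more than k standard errors from the fitted curve."""
    deviations = [r - c for r, c in zip(ratios, curve)]
    sigma = (sum(d * d for d in deviations) / len(deviations)) ** 0.5
    return [abs(d) > k * sigma for d in deviations]


# Hypothetical ratios for one month and a level fitted curve.
ratios = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 105.0, 100.0, 95.0, 112.0]
curve = [100.0] * len(ratios)

extreme_at_2 = flag_extremes(ratios, curve, k=2.0)
extreme_at_1 = flag_extremes(ratios, curve, k=1.0)
```

In this example limits of two standard errors isolate only the ratio of 112, while limits of one standard error also isolate the ratios of 105 and 95.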








TABLE A-1

METHODS OF ESTIMATING SEASONAL-IRREGULAR RATIOS AND IMPLICIT WEIGHTS GIVEN TO AVAILABLE RATIOS IN COMPUTING SEASONAL ADJUSTMENT FACTORS

(N is the last year for which a seasonal-irregular ratio, X_N, is available; the weights are adjusted so their sum equals 10)

[Table body illegible in this copy. Columns: extrapolation method; implicit weights given to the ratio for each year in the factor for year N-1; implicit weights given to the ratio for each year in the factor for year N. The rows include Method I and Method II, the latter with X_{N+1} = X_{N+2} = ½(X_N + X_{N-1}).]




(c) Moving averages of seasonal-irregular ratios: Where the average monthly irregular 
amplitude is less than 2, Method II now uses a three-term moving average of a three-term 
moving average, which is equivalent to a five-term moving average with weights 1, 2, 3, 2, 
1; for series where the average irregular amplitude is 2 or more, it uses a three-term moving
average of a five-term moving average, which is equivalent to a seven-term moving aver- 
age with weights 1, 2, 3, 3, 3, 2, 1. This weighted seven-term moving average sometimes 
does not turn with the ratios, and, of course, requires more extrapolation for missing 
ratios than the weighted five-term moving average. We are now considering two changes: 
(i) the substitution for the three of a five-term moving average, of a five-term moving 
average with different weight patterns, for example 1, 3, 4, 3, 1—this curve, a member of a 
family of weighted moving averages suggested by Victor Zarnowitz, has the advantage of 
a shorter period involving less extrapolation at the ends and may also be expected to fol- 
low the seasonal-irregular ratios more closely, since the central points have relatively 
more weight; (ii) the use of less flexible curves, possibly straight lines, for measuring the 
seasonal adjustment factor for series in which the irregular factor is very pronounced. 
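
The equivalences stated in (c) can be verified by convolving the weights of the component moving averages. The sketch below uses only the figures given in the text; the function names are hypothetical.

```python
def convolve(a, b):
    """Weights of the moving average obtained by applying a, then b."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


THREE_TERM = [1, 1, 1]
FIVE_TERM = [1, 1, 1, 1, 1]


def seasonal_curve_weights(avg_irregular_amplitude):
    """Select the weight pattern according to the irregular amplitude."""
    if avg_irregular_amplitude < 2:
        return convolve(THREE_TERM, THREE_TERM)   # 1, 2, 3, 2, 1
    return convolve(THREE_TERM, FIVE_TERM)        # 1, 2, 3, 3, 3, 2, 1
```

The longer pattern spans seven years, so three ratios rather than two must be estimated at each end, which is the extra extrapolation the text refers to.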

(d) Variable cycle-trend curves: We are searching for a family of curves to use for series 
with different irregular components. We are considering (i) Robert Henderson’s general 
formula which makes the sum of the squares of the third differences in the weights of the 
weight diagram a minimum for any number of terms desired; (ii) variants of the five-term 
moving average with weights 1, 3, 4, 3, 1: for example, a nine-term moving average with 
weights 1, 3, 6, 8, 9, 8, 6, 3, 1. For relatively smooth series, as indicated by the magnitude 
of the irregular component, these curves would be used in place of the weighted fifteen- 
term moving average (Spencer curve), now used to delineate the cycle-trend component. 
Such curves, being for a shorter period, would involve less extrapolation at the end and 
would perhaps also result in better estimates of the irregular component. 

(e) Correlation of I and S: A common method of judging the validity of a seasonal ad- 
justment is to compare the month-to-month movements in the seasonally adjusted series 
with the month-to-month movements in the seasonal adjustment factors. Following our 
usual thinking, the seasonally adjusted series is considered to be made up of trend-cycle 
and irregular factors. Since a smooth curve, usually the Spencer graduation, is used as the 
estimate of the trend-cycle factor, it may be disregarded for this purpose and the correla- 
tion coefficient between the month-to-month movements of the irregular and seasonal 
factors may provide a test of the validity of the seasonal adjustment. Since a residual 
seasonal will often appear in a positive pattern in some years and in an inverted pattern 
in others, separate correlation coefficients have to be computed for each year. The presence 
of significant correlation coefficients would be interpreted to mean that there is a seasonal 
component in the adjusted series; in this case a statement would automatically be printed 
after the computations indicating that a residual seasonal pattern remains and that 
further work is required.1

This test would also be applied to determine whether there is a seasonal pattern in the 
original observations. Here the cycle-trend curve would be divided into the original ob- 
servations and the quotient correlated with the seasonal adjustment factors. The absence 
of significant correlation coefficients would be interpreted to mean that there is no seasonal 
pattern in the original observations. In such cases, the statement that the original ob- 
servations have no seasonal pattern would be printed instead of the tables. 
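
The test described in (e) can be sketched as follows. This is a minimal illustration under stated assumptions: month-to-month movements are taken as simple first differences within each calendar year, the correlation is the ordinary product-moment coefficient, and the function names are hypothetical. The significance level at which a message would be printed is not given in the text and is left to the user.

```python
from math import sqrt


def month_to_month_changes(values):
    """First differences between consecutive months (an assumed definition)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def pearson(xs, ys):
    """Product-moment correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def yearly_correlations(irregular, seasonal):
    """Correlate irregular and seasonal movements separately for each year.

    Both arguments map a year to its list of twelve monthly values.
    """
    result = {}
    for year in sorted(irregular):
        d_irregular = month_to_month_changes(irregular[year])
        d_seasonal = month_to_month_changes(seasonal[year])
        result[year] = pearson(d_irregular, d_seasonal)
    return result
```

Computing one coefficient per year, rather than one for the whole series, keeps a positive residual pattern in some years from cancelling an inverted pattern in others.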

While these changes may appear to be large, we do not believe they would affect many 
series. The Univac programming and the experimental work involved are substantial, however,
and changes cannot, therefore, be introduced in the method for some time. The user
of Method II should expect further refinements with the accumulation of additional ex- 
perience. Many of these improvements have been suggested by the experience of users and 
further suggestions would be most welcome. 





1 See Arthur F. Burns and Wesley C. Mitchell, op. cit., pp. 54-55. 
APPENDIX B


OTHER ELECTRONIC COMPUTER METHODS FOR SEASONAL ADJUSTMENT 


Two additional computer methods for seasonal analysis have been programmed recently 
and applied on a limited scale. A brief description of them follows: 


1. Regression Seasonal Adjustments 


The present writers have prepared and “proved-in” a program for the calculation of 
regression seasonal adjustments. In this method, the original observations and a Spencer 
fifteen-month weighted moving average of the standard seasonally adjusted data in 
Method II are used as the basis for the computations. Differences between the original
observations and the Spencer graduation are computed to provide a measure of the 
seasonal-irregular component. Seasonal adjustment factors are then fitted to (a) the 
differences as the dependent variable, and (b) the corresponding values of the smooth 
curve of the seasonally adjusted series as the independent variable. 

The logic of this approach is as follows: Consider a monthly time series for which a 
scatter diagram is drawn so that values for a given month are plotted as the ordinate and 
the corresponding values representing the trend and cyclical components as the abscissa. 
If the original values for the month include neither a random nor a seasonal component, 
all the points fall on a straight line that passes through the origin and has a slope of one 
because the trend-cycle component has merely been plotted against itself. If the assump- 
tions are changed to allow a multiplicative seasonal component in the original values, all 
the points fall on a straight line that passes through the origin, but the slope deviates 
from one. If the original values include an additive seasonal component, the slope of the 
line remains one, but the line no longer passes through the origin. If the seasonal com- 
ponent is partly additive and partly multiplicative, the line does not pass through the 
origin and its slope differs from one. These relations tend to prevail if the series also in- 
cludes a random component. However, the observations no longer fall on a straight line, 
but tend to be distributed at random around such a line. It can be concluded, therefore, 
that the seasonal component for a given month can be measured by the difference between 
the parameters of a fitted straight line and the parameters of a line passing through the 
origin and having a slope of one. 
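
The scatter-diagram argument can be checked numerically. The sketch below fits a straight line by ordinary least squares to values for a single calendar month plotted against the trend-cycle values; the figures are invented for illustration and the function name is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented trend-cycle values for one calendar month in five successive years.
trend = [200.0, 250.0, 300.0, 350.0, 400.0]

# Purely multiplicative seasonal of 10 per cent: slope 1.10, intercept 0.
print(fit_line(trend, [1.10 * t for t in trend]))

# Purely additive seasonal of 30 units: slope 1, intercept 30.
print(fit_line(trend, [t + 30.0 for t in trend]))

# Mixed seasonal: both parameters depart from the reference line y = x.
print(fit_line(trend, [1.10 * t + 30.0 for t in trend]))
```

In each case the seasonal component is read off as the departure of the fitted intercept from zero and of the fitted slope from one.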

In order to allow for the possibility of a changing seasonal pattern, time is introduced as 
a third variable. The equation used to derive the seasonal adjustment factors for each 
month is $y - x = a + bx + ct + dxt$, where $y$ represents the original observations, $x$ represents
the corresponding values of the trend-cycle curve, and $t$ represents time. Other variables
could, of course, be added to this program, for example, variations in the average tempera- 
ture, the number of Saturdays and Sundays in each month, and so on. 
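
Collecting terms in the regression equation (a rearrangement only, writing $x$ for the trend-cycle value and $t$ for time) shows how its coefficients correspond to the scatter-diagram argument given earlier:

```latex
\[
  y - x = a + bx + ct + dxt
  \quad\Longrightarrow\quad
  y = (a + ct) + (1 + b + dt)\,x .
\]
```

Relative to the reference line $y = x$, the intercept departs from zero by $a + ct$ (the additive part of the seasonal, which may drift with time) and the slope departs from one by $b + dt$ (the multiplicative part, which may likewise drift). Setting $c = d = 0$ gives a seasonal pattern that does not change over time.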

The regression technique for measuring and adjusting seasonal fluctuations comprises 
an entirely different conceptual approach from that followed in Methods I and II. In 
making the adjustments it attempts to take into account certain causes of seasonal varia- 
tions. This is intellectually preferable to the more mechanical approach of the earlier 
methods. On the other hand, the regression technique is very sensitive: The regression 
curves are fitted to approximate measures of the seasonal-irregular factors; minor defects 
of measurement can result in poor regression curves, as was demonstrated by earlier ex- 
periments with the use of deviations from the twelve-month moving average of original 
observations. Furthermore, a method of handling extremes must also be developed for 
this program. While this approach is promising, the writers do not feel that there is as yet 
enough experience with it to form a judgment of its usefulness. 


2. Moving Polynomial Graduations 


A seasonal program has been prepared for the IBM 701 electronic computer following a 
plan developed at the National Bureau by Millard Hastay. While this program, like
Method II described above, is based on the standard ratio-to-moving-average method, it 
differs in a number of important respects. First, the smoothing of the seasonal-irregular 
ratios for each month is accomplished in the IBM program by moving polynomial graduations.
More specifically, for each month a third-degree polynomial is fitted by least squares
to overlapping eleven-term periods of seasonal-irregular ratios. The smoothed value for 
each point is the central fitted value of its associated third-degree polynomial. For the 
last five ratios for each month, the smoothing is accomplished by taking the last five 
fitted values of the same polynomial, that which is fitted to the last eleven ratios of the 
month. Similar smoothing is made in the first five ratios of the month. Certain constraints 
are also put on the smoothed curves used for the beginning and ending years; for example, 
the first derivative is required to equal zero at the terminal year. (The moving polynomial 
approach is like the use of short-term weighted moving averages in the Univac program. 
The ratios required to bring these short-term averages up to date are obtained in the 
Univac program by taking averages of the last two ratios available as the estimated values 
of each of the following two ratios.) 
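
The moving polynomial graduation described above can be sketched as follows. This is an illustration, not the IBM 701 program: it fits each cubic by solving the normal equations directly, takes the central fitted value for interior points, and takes the remaining fitted values of the first and last polynomials for the five points at each end. The constraints on the end curves (for example, a zero first derivative at the terminal year) are omitted, and all function names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_cubic(values):
    """Fitted values of the least-squares cubic through equally spaced points."""
    n = len(values)
    ts = [i - (n - 1) / 2 for i in range(n)]  # abscissa centered on zero
    normal = [[sum(t ** (p + q) for t in ts) for q in range(4)] for p in range(4)]
    rhs = [sum((t ** p) * v for t, v in zip(ts, values)) for p in range(4)]
    coef = solve(normal, rhs)
    return [sum(c * t ** k for k, c in enumerate(coef)) for t in ts]


def moving_cubic(ratios, span=11):
    """Smooth one month's seasonal-irregular ratios across years."""
    n = len(ratios)
    if n < span:
        raise ValueError("series is shorter than the smoothing span")
    half = span // 2
    out = [None] * n
    for start in range(n - span + 1):
        fitted = fit_cubic(ratios[start:start + span])
        out[start + half] = fitted[half]       # central fitted value
        if start == 0:
            out[:half] = fitted[:half]         # first five points
        if start == n - span:
            out[n - half:] = fitted[half + 1:]  # last five points
    return out
```

A series that is itself a cubic is returned unchanged, which is a convenient check on the fitting step. The `ValueError` reflects the limitation noted below, that the method cannot be applied without modification to fewer than eleven years.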

To minimize the effects of “extreme” ratios, the period used for the IBM program was 
taken as eleven years. For the test series studied, this sometimes did not give satisfactory 
results, with poor adjustments almost always traceable to extreme ratios. Moreover, the 
method cannot be applied without modification to periods shorter than eleven years. (The 
Univac program uses control limits to identify extreme ratios and then reduces the weights 
assigned to the extreme items.) 

Other differences between the two electronic computer programs are the use in the 
Univac method of (1) an iterative procedure to obtain improved seasonal-irregular ratios 
—that is, a fifteen-term weighted moving average (approximately equivalent to a moving 
third-degree polynomial) of a preliminary seasonally adjusted series is used as the basis 
for obtaining the final seasonal-irregular ratios; and (2) different weighted moving average 
curves, which vary according to the magnitude of the irregular component of each series, 
to obtain the final seasonal adjustment factors from the seasonal-irregular ratios. 

Thus far experience with the IBM 701 program has been quite limited. However, draw- 
ing on experience with the Univac program, Hastay has recommended the direct identifica- 
tion and replacement of extreme ratios, as in the Univac method, instead of the present 
indirect attack on this problem by long-period smoothing. With extreme ratios handled 
directly, polynomial smoothing over shorter periods would become feasible. This improve- 
ment, and the addition to the IBM program of the iterative technique, with a weighted 
moving average to measure the trend-cycle component, would bring the IBM and Univac 
programs closer together. 


TABLE 1 
SEASONAL COMPUTATIONS, METHOD II 
TOTAL UNEMPLOYMENT, UNITED STATES, 1940-1957 


Reproduction of Actual High-Speed Print-Out, Reduced 60 Per Cent 
Original Observations derived from Census Bureau’s Monthly Report on the Labor Force, 
Series P-57 (Thousand Persons) 


[The print-out itself is not legible in this copy. Its panels, each headed SERIES #4406 and laid out by year (rows) and month (columns), appear in the following order where the headings can still be read: 1, Original series; 2, Ratios of original to preceding and following, with averages of ratios; 3, Uncentered 12-month moving average of original; 4, Centered 12-month moving average of original; 5, Ratios of original to 12-month moving average; 6, Preliminary adjustment factors; 7, Preliminary adjusted series; 8, Weighted 15-month moving average of preliminary adjusted series; Modified ratios, original to weighted 15-month moving average; Stable-seasonal adjustment factors; Stable-seasonal adjusted series; Centered ratios, original to (heading illegible); 12, Final seasonal adjustment factors; Final seasonally adjusted series; Ratios, final adjusted to preceding and following; Uncentered 12-month moving average, final adjusted; Ratios, 12-month moving averages, final adjusted to original; Each month to preceding January, final adjusted.]





PROBLEMS IN MEASURING LONG TERM GROWTH 
IN INCOME AND WEALTH* 


ALEXANDER GERSCHENKRON 
Harvard University 


“Acceptable long-term records of national income and wealth and of their
customarily distinguished components constitute indispensable minimum
information in the study of economic growth.” Few would take exception to 
this statement by Simon Kuznets in his “Introduction” to the present volume, 
and this reviewer agrees emphatically. Without knowledge of basic aggregates, 
economic history at best would remain confined to easy but unsubstantiated 
generalizations. Most likely, it would relapse into legal and political history, 
into essays in biography, and into sociological schematism and sociological 
impressionism. In short, economic history would contain everything, except 
one thing—economics. It is another matter that once the pertinent economic 
questions have been posed and the requisite empirical information has been
assembled and placed within economically significant frameworks, interpreta- 
tion of the findings inevitably would call for recourse to various non-economic 
factors and accordingly to disciplines other than economics. What is at stake, 
of course, is not professional provincialism, but methodological clarity with 
regard to the specific subject matter of economic history. 

For the most part, the present volume purports to summarize the state of 
our knowledge of long-term trends of national output (and wealth) with regard 
to four major and two smaller countries: the United Kingdom, France, Ger- 
many, Japan, Denmark, and Hungary. Some of the essays, however, embody 
a good deal of original work on the part of the authors. The character and the 
quality of the individual contributions are not uniform. Nor are the availability
and reliability of the basic data. It is partly for this reason that the high initial 
expectations with which one begins the perusal of the volume are soon tempered 
by disappointments. No doubt much remains to be done before the results can 
be used conveniently and confidently for the purposes of historical interpreta- 
tion within the individual countries; the road to meaningful comparisons 
among them would seem even much longer. 

The first paper by James B. Jefferys and Dorothy Walters (“National Income
and Expenditure of the United Kingdom, 1870-1952”) is modestly presented
by the authors as a review of progress made. It is more than that, as the
process of fitting together and reconciling the various existing estimates in- 
volved much additional and original work. 

From a statistical point of view this paper is the most mature among the six 
contributions. Both the income side and the expenditure side of social accounts 
are presented. The discrepancies between them are frankly and carefully dis- 
cussed, and the reader can form an opinion of his own with regard to the re- 
liability of the estimates. It is gratifying to see that after 1890 the disparity 





* An invited review article on Income and Wealth, Series V, Simon Kuznets, Editor. International Association
for Research in Income and Wealth, 1955. Pp. xiv, 242. 42 shillings. 
between the two series is reduced to quite tolerable proportions. The last section
(III) of the paper contains a brief but not inadequate discussion of the
methods used in constructing the component series. The weakest point of the 
calculations seems to lie in the conversion of estimates at current prices into 
constant prices of 1912-13. Apparently, no attempt has been made to use 
different deflators for different types of consumers’ goods. Nothing is said about 
the weights employed in the construction of the various price indices which the 
authors had to link in order to obtain consecutive price series for the whole
period. Accordingly, it is not clear at all what weights actually underlie the
income series allegedly expressed at prices of 1912-13. To obtain an index of 
physical volume at constant prices of one period, the deflator for each year’s 
current values, properly speaking, should be a price index based on weights 
pertaining to that year. In other words, a consistent base year volume index
requires given year price indices as deflators. In what sense “output values at 
constant prices of one year” as presented here can be regarded as actually 
aggregated on the basis of the price structure of that year is anybody’s guess. 
The comparability among the individual subperiods as well as the rate of 
growth for the period as a whole remains problematic under these circum- 
stances, as the data for the individual years must be subject to a varying degree 
of distortion. The authors might well have included an investigation of the 
distortions inherent in the series in their list of various “gaps” to be filled by 
further study. They are careful to point out that their series is not adequate 
for the purposes of short-term analysis. But as long as we have no idea as to the 
direction and probable extent of those distortions, also the degree of retardation 
or acceleration over longer periods is quite elusive. This is a serious limitation 
on any historical interpretation of the results, although it must be admitted 
that the deflating techniques used in this paper are greatly superior to those 
used in some other contributions to this volume, which will be discussed
presently. 
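
The point about deflators can be illustrated with a two-commodity example. The prices and quantities below are invented. The labels "Laspeyres" (base-year weights) and "Paasche" (given-year weights) are the standard names for the two index forms and do not appear in the paper under review.

```python
def value(prices, quantities):
    """Aggregate value of a bundle at the given prices."""
    return sum(p * q for p, q in zip(prices, quantities))


# Invented data: base year (0) and given year (1), two commodities.
p0, q0 = [1.0, 2.0], [10.0, 5.0]
p1, q1 = [2.0, 2.5], [12.0, 9.0]

value_index = value(p1, q1) / value(p0, q0)

# Price indices with given-year weights and with base-year weights.
paasche_price = value(p1, q1) / value(p0, q1)
laspeyres_price = value(p1, q0) / value(p0, q0)

# Volume at constant base-year prices, the series the authors aim to present.
laspeyres_volume = value(p0, q1) / value(p0, q0)
# Volume at constant given-year prices.
paasche_volume = value(p1, q1) / value(p1, q0)

# Deflating by the given-year-weighted price index yields base-year volume.
print(value_index / paasche_price, laspeyres_volume)
# Deflating by a base-year-weighted price index yields something else.
print(value_index / laspeyres_price, paasche_volume)
```

Only the first deflation returns output valued at base-year prices. A deflator of unknown or shifting weights therefore leaves the meaning of the resulting "constant price" series undetermined, which is the difficulty raised above.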

On the other hand, it is to be welcomed that the authors have presented their 
data at current prices for each year of the period under review and have made 
it possible for the interested reader to compute the national income at constant 
prices in the same fashion. The mechanical presentation, by calendar decades 
or quinquennia, be they overlapping or not, is hardly adequate for many prob- 
lems raised in historical analysis, in particular for the all-important problem 
of relation between structural spurts of growth and the intervening cyclical 
fluctuations. Nor is there much doubt that such an analysis would call for 
much more detailed break-downs of growth by production sectors than the 
ones used in this paper to obtain the aggregate income figures. 

From any historical point of view, one must also join the authors in the re- 
gret that their series begins at such a late point in the 19th century. This is 
particularly so if one recalls the lower reliability of the data for the seventies 
and the eighties and the interruptions of continuity by the two great wars. 
Still, this point must not be over-emphasized. There are reasonable limits to 
ingratitude, and this reviewer should not like to cross them.

François Perroux, the author of the paper on French economic growth
(“Prise de vues sur la croissance de l’économie française, 1780-1950”) shows
much less interest in the description of statistical techniques, and much more
concern with broader historical problems. In particular, his distinction between 
“active” and “passive” components of national income in the process of its 
growth is very helpful in providing direction and orientation for quantitative 
research. In Perroux’ words, it is the “reasoned history” that must inhale 
meaning into statistical analysis. That in reality there is, and must be, a steady 
interaction, and questions addressed to the facts are just as important as
questions raised by the facts, is another matter. Unfortunately, the author’s 
dictum has a special meaning as it is designed to convey his distrust of the 
quantitative data which he makes available: If the latter are at variance with 
what we should expect from general historical knowledge, they should be 
rejected. 

The reader is unable to pass judgment on this attitude, as the author does
not allow him any glimpses into the statistical kitchen in which the estimates
have been brewed, beyond giving references to some previous studies. He is not
even told just what concept of national income is embodied in the series pre- 
sented, except that the data have been based on production statistics. The 
only statistical point on which the author is explicit is his criticism of the 
methods used to deflate current values to constant price magnitudes. After 
what has been said before, one can only agree that the job of conversion, should 
its results make sense, must be regarded as a much more arduous one than is 
usually assumed. While Perroux does not advert to the basic weighting problem 
mentioned above, in a special section of his paper he vents bitter contempt on 
the impropriety of deflating heterogeneously composed values by some specific 
price index and is eager to show how the use of unsuitable deflators at times 
results in curious and unwarranted irregularities in the deflated series which 
reflect nothing but some spasmodic movements in the deflator chosen. 

It is very useful to have all this said as it should draw much needed attention 
to the problem and reduce the willingness to engage lightheartedly in mechani- 
cal deflations. But Perroux draws a practical consequence. He abandons all 
pretense at obtaining comparable values at constant prices and decides to take 
changes in values at current prices as representing changes in physical volume 
of output. He supports and justifies this procedure by the fact that the period 
between the reign of Napoleon I and the First World War was free from major 
monetary disturbances and accordingly current values are at least as good as 
deflated values for that period. It might be noted that Walther Hoffmann 
arrived at a similar conclusion in his study of British industrial output. The 
trouble, however, is that to reach such a conclusion reliably, one would have 
to have correct deflators and in their absence the actual relation between 
value and volume is quite uncertain. Accordingly, the rates of growth as 
given by Perroux must be taken with extreme caution. The margin between 
maximum and minimum rates of growth shown by Perroux is fairly narrow, 
and it is quite conjectural just how much importance can be attributed to
comparisons among them. 

The rhythm of long-term development, as it emerges from the data, is cer- 
tainly not inconsistent with opinions generally held on the course of French 
economic history in the 19th century. But as long as quantitative research
must be tested by vague and impressionistic ideas rather than the other way 
round, the progress achieved cannot be regarded as impressive. 

Perhaps nothing reveals Perroux’s skepticism so clearly as his discussion of 
whether or not French society was economically progressive in the course of 
the last 150 years. He answers the question in the negative. A progressive 
economy, he says, is one in which “technological inventions are translated into 
economic innovations with a minimum of delay and a minimum of social cost” 
involving use of all the human and material resources considered as an entity 
(p. 72). This, he says, was not the case in France. One must wonder as to the 
relevance of such considerations in a paper of this sort. It is not clear at all that 
they are presented to explain why the rate of growth was not higher than it 
actually was, and the value of concepts not adapted to the nature of the 
material presented is dubious. According to the data given, French national 
output increased more than fourfold between 1825 and 1909, which surely is
a considerable rate of progress for a “non-progressive” economy. It would seem 
that in a quantitative study, progressiveness—or the lack of it—must, at least 
in the first instance, be conceived in quantitative terms and qualitative con- 
cepts should be brought in to explain the results rather than to negate them. 
If this is not done, it must be taken to mean that quantitative research is not 
yet able to produce trustworthy results. 

The paper on Germany by Paul Jostock (“The Long-Term Growth of Na- 
tional Income in Germany”) is much less reticent concerning the nature of the 
estimates presented. The text contains some discussion of the methods used 
and more is said in a special Appendix. The picture is approximately as follows: 
For the period before World War I detailed computations of national income 
exist for one year only, 1913. In addition, there are official extrapolations for the 
years 1891-1913 on the basis of income tax returns for Prussia and Saxony 
only. For 1860-1890 the author prepared an estimate of his own for five years 
(1860, 1870, 1877, 1883, and 1890) and interpolated values for the intervening 
years. The “benchmark” years’ income was variously estimated. For instance 
the value of net industrial output was calculated as follows: A previously 
available index of gross value of industrial output at current prices (which, 
incidentally, was derived by the multiplication of an index of physical output 
by an index of wholesale prices) was multiplied by a (previously available) 
figure of net industrial output in 1913. The implicit assumption as to the con- 
stancy of the gross-net ratios over 53 years may not be so very implausible 
since the data are at current prices and the higher degree of fabrication is 
likely to have been offset by the relative decrease in prices of value added 
components. The uncertainty, however, about the mutual appropriateness of 
weights in the two underlying series may be much more serious. 

The computations for national income produced by agriculture are even 
cruder. As to the remaining sectors of the economy, their contribution for 
1890 was estimated roughly as a residual by first extrapolating roughly the 
official rough estimate for total national income for 1891 back to 1890. For 
earlier years, “the necessary estimates had to be roughly approximated... ” 
by relying “on knowledge of general developments” (p. 120). It must be noted
that what was being estimated in this fashion was said to amount to no less 
than fifty per cent of national income in 1890. 

The data for 1925-1941 are no doubt much more reliable, and, in fact, the 
best of the whole series. Much less so are those relating to the years after the 
last war which have been computed by applying production indices to the 1936 
census. All current values from 1913 onward have been deflated to constant val- 
ues by dint of a single cost-of-living index. The data for the years before 1913 
were deflated by a single wholesale price index. The result is said to be a physi- 
cal output series at 1928 prices. After what has been said before about the 
problem of weight correspondence between the divisor index and the resulting 
quotient index and bearing in mind Perroux’ strictures it should be fairly clear 
that the homogeneity of the physical volume series is a highly doubtful one, 
to say the least. It should not be forgotten that all these problems precede, as 
it were, the real index number problem. It is only after the correct weights 
have been obtained and output values have been consistently expressed in 
terms of a given period that one can begin to wonder what would happen to 
the index and the rates of growth implied in it, and to the component series, 
if another more remote or more proximate period were chosen for the purposes 
of weighting. 

After having presented his data, the author goes into exploration of a
number of interesting and relevant problems. He discusses the meaning of the 
index in terms of various structural changes in the economy, such as the shifts 
away from household production, changes in age distribution of the population, 
expansion of “unproductive” activities, role of military expenditures, and ter- 
ritorial changes. He also tries to adjust his series by taking into account price 
level differentials existing among localities of various size. The adjustment, 
however, is quite mechanical, and the author is well aware of its limitations. 
To assume constant price differentials among towns of different size over a 
period of some eighty years is really quite hazardous. Moreover, since the 
expenditure side of national income is as yet unexplored and the division be- 
tween consumption and investment unknown, the adjusted data refer to per 
capita national income rather than per capita consumers’ expenditures. In 
addition, the paper includes an attempt to estimate the value of increased 
leisure; it investigates the change in income per capita of gainfully engaged 
population; it provides information on the changing ratio of industrial and 
agricultural output; and at least for the post-1913 period, has something to say 
on the changes in distribution of income. 

All these problems, largely posed or inspired by Simon Kuznets’ work are, 
of course, most worthwhile. Yet one cannot shed the feeling that the time for 
discussing them has not arrived. When one considers that the only really con- 
secutive historical period for which long-term change is meaningful is that from 
1860 to 1913; when one recalls the nature of the estimates for that period and 
the degree of their reliability and deflates thereby the elaborations made and 
the conclusions reached, the resulting real income in terms of safe historical 
knowledge cannot be very large. 

The paper on Denmark by Kjeld Bjerke (“The National Product of Denmark,
1870-1952”) provides complete year by year estimates of national income
from 1870 to 1952. These estimates are divided, on the one hand, into
agriculture and “other industries” and, on the other, into consumption and 
gross investment. The latter, which does not include working capital, has been 
estimated separately for building, construction, machinery and equipment, and 
transportation. Consumption appears to be computed by adding the net sur- 
plus in the balance of payments to the gross investment figure and then de- 
ducting the resulting sum, representing “gross saving,” from the independently 
made estimates of total national income. The latter have been derived from 
income tax returns adjusted for tax exempted incomes for the years 1870-1920, 
and from production statistics for 1921-1952. It is only from 1930 on that an
official series began to be computed. There is little opportunity for the reader 
to gauge the reliability of the methods used. What is made clear, however, is 
that the physical output series, i.e., national output in terms of 1929 prices 
cannot be really regarded as such, except with strong reservations. The data 
for 1914 to 1928 have been converted into 1929 values by means of a cost-of- 
living index, except that for the years 1914-1921 an average of a cost-of-living 
index and a wholesale price index was used. No attempt has been made to de- 
flate separately for agriculture and “other industries.” For 1929-1952, the 
existing official series at 1935 prices was reduced to 1929 prices by a cost-of- 
living index. Again, the problem of weighting is shrouded in silence. But the 
most remarkable part of the procedure is the deflation of the pre-1914 figures. 
The 1913 data were converted to 1929 prices via the cost-of-living index. 
Thereupon, the series at current prices for the years 1870-1913 was adjusted 
by the ratio of the 1913:1929 price relatives. In other words, the data for 1870 
to 1913 while adjusted by that ratio, still reflect all the price fluctuations ex- 
perienced during those 43 years! 
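
The objection can be made concrete with a small numerical sketch. All figures below are invented for illustration (they are not taken from the paper under review), and the function names are hypothetical. Multiplying every pre-1914 value by one constant ratio leaves the growth rate of the current-price series untouched, whereas deflating each year by its own price level changes it.

```python
# Hypothetical current-price income and price index (1913 = 100); not real data.
current_income = {1870: 500.0, 1890: 700.0, 1913: 1500.0}
price_index_1913_base = {1870: 80.0, 1890: 70.0, 1913: 100.0}
price_ratio_1929_to_1913 = 1.6  # assumed ratio of 1929 prices to 1913 prices


def scale_by_constant_ratio(series, ratio):
    """The criticized procedure: one constant factor for all years."""
    return {year: value * ratio for year, value in series.items()}


def deflate_year_by_year(series, index, ratio):
    """Remove each year's own price level first, then rebase to 1929 prices."""
    return {
        year: value / (index[year] / 100.0) * ratio
        for year, value in series.items()
    }


scaled = scale_by_constant_ratio(current_income, price_ratio_1929_to_1913)
deflated = deflate_year_by_year(
    current_income, price_index_1913_base, price_ratio_1929_to_1913
)

# Growth 1870-1913 under each procedure.
growth_scaled = scaled[1913] / scaled[1870]
growth_deflated = deflated[1913] / deflated[1870]
growth_current = current_income[1913] / current_income[1870]
```

In this sketch `growth_scaled` equals `growth_current`, so the "constant price" series still carries every price movement of the period, while `growth_deflated` differs from it.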

The paper on Hungary by Alexander Eckstein (“National Income and Capi- 
tal Formation in Hungary, 1900-1950”) is by far the most detailed piece in 
the volume. It is very explicit and careful in the discussion of the concepts 
employed, and there is no attempt to pass over lightly the techniques employed 
and the procedures followed. This is the only paper in the volume in which the 
conversion into constant prices is discussed more freely. While the actual con- 
version in itself may not be very superior, credit must be given for the attempt 
to use different deflators for different components of national income, both for 
the post-World-War I period and for connecting the pre-war data to the 1938- 
39 price base. But the deflation of the 1900-1914 figures is fairly dismal. An 
unweighted price index, based in part on price quotations from other parts of 
the Hapsburg Monarchy is used for the purpose. 

In general, this is probably the most original among the six contributions. 
The study of Hungarian national income over a long period presents particular 
difficulties because of the drastic territorial retrenchment after 1918 and the 
profound organizational change of the economy and in the prevailing official 
views regarding coverage of national income after the last war. While previous 
studies have been used as far as possible, a great many adjustments and im- 
provements have been introduced by the author. The most independent part 
of the study refers to capital formation for the years 1924/25-1949. Starting 
from an existing monograph for the years 1937-40, the author constructed 
estimates of capital formation for the remaining interwar years and for the
first post-war years. As was to be expected, the goodness of individual estimates
is subject to variation. The important capital formation in industry, for instance,
is based on much less than perfectly convincing methods and there is
at least a hint of a possible double-counting. Nevertheless, the criteria and the
methods developed in this study will certainly be basic to all further progress 
in this area. An additional advantage of the study lies in the author’s willing- 
ness to engage in some historical interpretation both with regard to the rate 
of growth by sectors and to the investment trends. Like Perroux, the author 
stresses the role of the “active” or “motive” components of national income. 
He may be mistaken in his belief that the early stress on iron and steel as well 
as machinery output was peculiar to Hungary alone among the European in- 
dustrializations, but the attempt to see changes in national income within a 
broader historical framework is certainly most helpful. All in all, the volume 
would have gained greatly if some of the authors of the other papers had set 
for themselves equally high standards of critical analysis and kindliness to the 
reader. 

The last paper, by Yuzo Yamada (“Notes on Income Growth and the Rate 
of Savings in Japan”) is a brief report on the discrepancies which at present 
exist among a number of different estimates, including those by the author. 
These discrepancies are large indeed for the decades of the 19th century and 
the first years of the present century. Nevertheless, the author presents some 
provisional conclusions, mainly in correction of Colin Clark’s estimates which 
are said to be excessive with regard to the average rate of growth and especially 
with regard to the rate of savings. Since Japanese economists are eagerly en- 
gaged in attempts at reconciling the discrepancies and in improving the nature 
of the estimates, the present contribution must be considered as even more 
provisional than the other papers in the volume. 

It has seemed necessary to dwell at some length on the individual contributions
to this volume. The subject matter of these essays is of fundamental
importance and the suggestive power of a printed figure is great. It cannot, 
therefore, be stressed too emphatically that for the most part the long term 
rates of growth are much too uncertain to allow of any reliable intertemporal 
and, least of all, interspatial comparisons. A great deal of work remains to be 
done and, in particular, the problems of conversion of value to volume must be 
explored much more painstakingly before the results can be used in any re- 
sponsible fashion. What this reviewer has found discouraging about the present 
essays is not so much the present low level of reliability of the deflating pro- 
cedures as the absence of a clear conception of the problems involved and of any 
insistence upon the need for constructive solutions. To turn away in disgust, 
as does Perroux, is perhaps an understandable but not too helpful an attitude. 
One must consider that price and cost-of-living indices so far have been con- 
structed by scholars and institutions interested mainly in changes in prices and 
cost of living for their own sake. Those engaged in investigating long term 
trends in national income and its components cannot hope to arrive at satis- 
factory results unless they embark upon construction of price indices especially 
designed to serve the needs of their work. Naturally, it will be very helpful to 
develop a number of specific deflators for as many sub-groups as possible. Since
in practice each sub-group will still contain very many commodities with very
different rates of change of output and price, the need for correct conversion 
of each sub-group by given year’s deflators will still remain. Moreover, in many 
cases one will have to continue operating with just one or two comprehensive 
price indices. In either situation it may be impossible to construct as many 
price indices as there are years in a long time series. But there is every proba- 
bility that by constructing price indices for a considerable number of short sub- 
periods, each of them weighted by magnitudes pertaining to a year within each 
sub-period, we should come reasonably close to a consistent volume index. It 
would seem that research on long term income estimates has reached a stage 
where this task should receive the highest priority. This at least is the con- 
clusion which emerges rather forcefully from the present volume. It is only 
when the job has been done and meaningful “physical output” series have been 
obtained, that one will be able to proceed to an investigation of the alternative 
vantage points in viewing long-term change. The better understanding of the 
index number problem will not eliminate the arbitrariness of our approaches, 
but it will make it possible to gauge its extent and reveal the historical signifi- 
cance of the weighting choices which are made. 

All this of course must not diminish our gratitude to the editor and the 
contributors. There is no intention to deny the importance of the findings 
summarized here nor to overlook the labor invested and the ingenuity dis- 
played. The volume shows with much clarity where we stand now. The very 
weakness of some of the results can be relied upon to instigate further elaborations
and improvements. Our ability to use past experience for the comprehension
and solution of present problems largely depends upon the success of
this enterprise.





THE RATIONAL ORIGIN FOR MEASURING SUBJECTIVE VALUES* 


L. L. THURSTONE AND LYLE V. JONES†
University of North Carolina 


A method is proposed and empirically demonstrated for extending 
Thurstone’s law of comparative judgment so as to transform psycho- 
logical qualities into an additive measurement scale. Application of the 
method yields results supporting the contention that subjective values 
can be measured on an additive scale, an equal unit scale with a 
meaningful zero point. 


THE PROBLEM 


In current scaling methods with the equation of comparative judgment [2, 3]
and its variants, the result is a scale difference for every pair of stimuli,
expressed in terms of an equal unit scale. For a set of n stimuli the subject is
presented with each pair of stimuli separately. There are n(n − 1)/2 such pairs if
no stimulus is presented with its duplicate as a pair. For each such presentation
the subject judges which of the pair has more of some attribute x. This attribute
may be a property of the stimulus such as beauty, desirability, or offensiveness. 
The scale separations of pairs of stimuli are then descriptive of the subjects as 
well as of the stimuli. 

When the scale separations have been determined for all pairs of stimuli there
is no unique zero point. The situation is analogous to that in which we know 
the differences in elevation between pairs of mountains. Such data give no in- 
formation about the elevation of any one of the mountains. Numerical values 
can then be assigned to the stimuli only by setting an arbitrary origin at any 
one of the stimuli such as the lowest stimulus. 

For many investigations this treatment of the scaling problem is adequate 
but there are other psychological problems where it is desirable to have a ra- 
tional origin. For example, we might want to say that the subjective value of a 
certain stimulus is twice that of another stimulus. Such a statement cannot be 
made unless there exists a rational origin for the scale of subjective values of 
the stimuli as to the attribute x. This paper describes a method of locating the
subjective origin experimentally. 

Fig. 1. Form of stimuli in Horst’s study.

This problem is not new. Horst studied the problem [1] with an ingenious
experimental method that will be described with Fig. 1. Let the vertical line
in that figure represent the affective continuum. The zero point on this continuum
represents neutrality or indifference. Any psychological object whose
scale value is above this point is one that the average subject in the experimental
group considers favorably. Any object below the neutrality point is
regarded as unfavorable by the average subject. In order to locate the zero point
Horst listed a number of events that would be generally regarded as disad- 
vantageous (e.g., “Spend a night in jail.”), and other events that would be re- 
garded as desirable (e.g., “Go to a good musical comedy.”). Then Horst asked 
his subjects to accept or reject each of a number of questions in the form, 
“Would you be willing to have the disadvantage B in order to have the ad- 
vantage C?” If the proportion of subjects who accepted this proposal was over 
.50, the inference was that the positive affective value of C was greater than the 
negative affective value of B. In fact, the equation of comparative judgment 
would give the quantitative difference between the absolute affective values of 
B and C. But this also determines the location of the zero point between B
and C. In the same manner one can make as many determinations of the zero
point as there are combinations of an advantage and a disadvantage. If the 
zero points so determined are reasonably stable on the scale, their average value 
can be taken as a rational origin for the subjective scale. 

Methodologically this solution is effective and it serves to demonstrate that 
a rational origin for the affective continuum can be experimentally located. 
In practice it has often a limitation in that it is rather awkward to list psycho- 
logical objects of negative value in some contexts. It would be more convenient 
in many situations to deal only with objects of positive value. We consider here 
a variation of the problem in which the zero point will be located with stimuli 
that are all positive in subjective value. 

In Fig. 2 are represented only objects of positive affective value. We show 
the scale locations of three such objects, A, B, and C and of their combinations 
AB, AC, and BC. By the combination AB is meant both A and B, and similarly 
for the other pairs. The subjects are asked to express their preference for each 
pair of single objects such as A or B. They are also asked to express their preferences
for such choices as AB or C. If a subject has a strong desire for the
object C, he might prefer to have C rather than the combination of A and B.
The subjects are also asked to express their preferences for such choices as AB 
or CD. 

In analyzing these preference records, each of the single stimuli is assigned a 
scale value by use of the law of comparative judgment. In addition, each com- 
bination such as AB is treated as a separate stimulus and it is also assigned a 
scale value. The rational origin is a point on the scale so chosen that the distance
from the origin to the combination AB is the sum of the distances to A and to B. 
Every combination of two stimuli determines in this manner the rational origin. 
It is then a question of experimental fact whether these zero points are clustered 
close together or widely scattered. If the experimentally independent determi- 
nations of the origin are close together and hence consistent, their average can 
be taken as the best location of the rational origin. If an internally consistent 
zero point can be found in experiments of this kind, we shall be able to say 
that one stimulus is, say, twice as valuable subjectively as some other stimulus. 
There are a number of interesting implications of such a finding for several of 
the social sciences. 

There is a fundamental assumption in this reasoning which may be stated at 
the outset. We are assuming that the anticipated satisfaction from ownership 
of both objects, A and B, is the sum of the anticipated satisfactions from A and 
B separately. This is not quite correct as may be seen by pushing the illustra- 
tion further. If the recipient already has twenty birthday presents, he is not 
likely to be so thrilled by the twenty-first present as if that one were the only 
recognition of the day. However, in setting up these experiments we are assuming
that in dealing with only two presents, the anticipated satisfactions can be
regarded as essentially linear for the combinations. Our main object is to locate 
a rational origin and for this purpose we shall use combinations of two presents. 
We need not make the more questionable assumption that the anticipated 
satisfaction from, say, twenty birthday presents is the sum of the satisfactions 
that are associated with each of them separately. We shall find that the additive 
assumption for two stimuli is plausible in terms of experimental findings. 


THE EXPERIMENT 


In designing an experiment to test the hypothesis described with Fig. 2 it 
was decided to use five objects that would be appropriate birthday presents for 
college students who were to be our subjects. In order to describe these objects 
adequately, each item was illustrated with a picture and a catalogue descrip- 
tion. This detailed information was presented on the first page of a schedule to 
which the subject could refer at will while recording his preferences. It was also 
decided that the subjects would rather express their preferences by checking 
pictures than by checking names of the objects. The pictures would probably 
enable the subjects to keep in mind the nature of the merchandise more easily 
than the names. 

In order to insure differentiation in the scale values of these five items it 
seemed desirable that they differ somewhat (but not extremely) in monetary 
value. It was expected that the actual choices would be determined mainly by 
individual interests and habits. Nevertheless, extreme differences in price would
probably result in extreme proportions of preferences near unity or zero. The
scale values would then be unstable and hence less useful in testing the addi- 
tivity hypothesis of this study. An absurdly extreme comparison like “a new 
automobile or a new brief case” would result in proportions of preferences at 
unity. Such a result could not be scaled at all. The five objects seemed to satisfy 
these preferred conditions. 

The verbatim instructions for the schedule were as follows: 


BIRTHDAY GIFT QUESTIONNAIRE 


The purpose of this questionnaire is to investigate preferences for articles which 
students might receive as birthday gifts. The articles are pictured and described on 
the following page. Study them carefully before you read further. 

(In the actual schedule, each of the five following descriptions was accompanied 
by a half tone illustration.) 


Brief case. Rough-grained split leather with disappearing handle. 3-side zipper. Plastic 
coated fabric lining. Brown color. 16×11 inch size.


Portable 3-speed record player. Plays all record speeds and sizes singly. Full-toned
4×6 inch speaker. Wooden case, covered with scuff-resistant brown artificial leather.


Parker “51” pen and pencil set. Easy-press filler fills quickly and easily. 14K gold 
scratch-resistant pen point. Matching propel-repel pencil utilizes 10 to 12 leads on a 
single filling. Lucite plastic body, satin finish, silver color, metal cap. 


Desk lamp. Complete with 18 inch fluorescent bulb. Sturdy steel body with baked-on 
brown enamel finish. 114 inches tall, 19} inches wide. 


Webster’s International Dictionary. Unabridged, completely up to date. 3,350 pages
of large readable print. Comprehensive sections of new words and phrases, biographies,
and many other items. 600,000 entries. Bound in buckram.

Assume that you do not possess any of the types of articles pictured here. In the
questionnaire are presented choices among various combinations of the articles. For 
each comparison check the picture of the article or articles you would prefer to own. 
Consider the articles to be gifts for your own personal use; they may not be sold. 

Look at the example comparisons below. (In the schedule, three examples were 
presented and discussed briefly, one illustrating each type of comparison.) 

For each comparison on the following pages, check the article or articles you would 
prefer to own. Remember that you are to judge the articles as if you do not already 
possess any of them. With some comparisons you may be in doubt, but you should 
respond anyway. 

There are a total of sixty-five preferences to be indicated. You may now turn the 
page and begin. 


In the actual schedules in which the subjects recorded their preferences 
there were three types of comparisons. The simplest were of the type “A or B.” 
Here the subject expressed his preference for the single object A or the single 
object B. This type will be referred to as single-single comparison. A second type 
consisted of pairs like “AB or C.” In this case the subject chose between a single 
item and a double item. This will be called single-double comparison. In the 
third type he selected “AB or CD.” This type will be called double-double com- 
parison. 

The schedule was built on five objects which are denoted A) brief case, 
B) dictionary, C) record player, D) desk lamp, and E) pen and pencil set. 
There are ten pairs of single objects. That is then also the number of single- 
single comparisons. In determining the number of single-double comparisons 
we note that for each of the ten doublets there are three possible single stimuli. 
Hence we have a total of thirty single-double comparisons. The number of 
double-double comparisons is the number of possible pairs of the ten doublets 
without duplication of any of the five objects. Hence we have a total of fifteen 
double-double comparisons. Listing these three types we have: 

Single-Single Comparisons 10 
Single-Double Comparisons 30 
Double-Double Comparisons 15 


Total 55 
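
The three counts can be checked by direct enumeration. The following is a minimal sketch using only the five object labels given above; the variable names are illustrative.

```python
from itertools import combinations

objects = ["A", "B", "C", "D", "E"]

# The ten doublets: unordered pairs of distinct objects.
doublets = list(combinations(objects, 2))

# Single-single: every unordered pair of single objects.
single_single = list(combinations(objects, 2))

# Single-double: a doublet against a single object not contained in it.
single_double = [(d, s) for d in doublets for s in objects if s not in d]

# Double-double: two doublets that share no object.
double_double = [
    (d1, d2) for d1, d2 in combinations(doublets, 2) if not set(d1) & set(d2)
]

counts = (len(single_single), len(single_double), len(double_double))
total = sum(counts)
```

The enumeration gives 10, 30, and 15 comparisons, 55 in all, in agreement with the listing.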


For each subject we have 55 choices. Ten additional pairs were included for 
checks of consistency but are not included in the present analysis. For each of 
the 55 pairs we tabulated the proportion of the subjects who chose each alterna- 
tive for each pair. There were 194 subjects in the experiment. They were male 
undergraduate students in the School of Business Administration at the Uni- 
versity of North Carolina. In Table 1 appears the proportion of the subjects 
who chose the stimulus at the left over the stimulus at the top of the table. In 
each diagonal cell is entered the proportion .50. 

Since the discriminal dispersions in the equation of comparative judgment 
(the standard deviations of preference distributions, one for each stimulus) are 
different for the three types of comparison it was necessary to treat these types 
separately in scaling. For this purpose it is convenient to denote the three types 
with different subscripts. The plan is shown in Fig. 3. The experimentally observed
proportions $P_{ik}$ are recorded in the second quadrant of such a table.
In every case the first subscript refers to the preferred stimulus so that $P_{ik}$ is
the proportion of subjects who preferred the single stimulus i to the single
stimulus k. In this table the subscripts i and k refer to single stimuli, and the
subscripts j and m refer to double stimuli. The 55 independent proportions are
below the main diagonal of the square matrix of Fig. 3; their complements
appear above the diagonal of the matrix.

Fig. 3. Matrix of observed proportions in schematic form.

The basic data for this study are recorded in Table 1. Inspection of this table 
shows that it is incomplete. The reason for this situation is that none of the 
five objects is repeated in the same comparison. For example, there is no entry 
in Table 1 for a comparison like AB against AC because the item A is common 
to the two doublets. The scale separation should be the same as that of B and C. 


TABLE 1 


PROPORTION OF SUBJECTS WHO PREFERRED STIMULI AT THE
LEFT TO THE STIMULI AT THE TOP. N=194

Table 2 shows the normal deviates corresponding to the experimentally observed
proportions in Table 1. These are obtained from tables of the normal
probability distribution. Because of the restriction to two categories of judg- 
ment Table 2 has a symmetry in that the entries above the principal diagonal 
are the same as the corresponding entries below that diagonal except for re- 
versal of sign. 
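
The conversion from a proportion to a normal deviate, and the sign symmetry just described, can be sketched as follows. The proportion used is hypothetical, and the inverse normal distribution function of the Python standard library stands in for the printed tables.

```python
from statistics import NormalDist


def normal_deviate(proportion):
    """Normal deviate x such that the standard normal CDF at x equals proportion.

    Proportions of exactly 0 or 1 raise an error, which mirrors the remark
    that such comparisons cannot be scaled at all.
    """
    return NormalDist().inv_cdf(proportion)


p_gh = 0.70  # hypothetical proportion preferring g to h
x_gh = normal_deviate(p_gh)
x_hg = normal_deviate(1.0 - p_gh)  # complementary entry across the diagonal
x_diagonal = normal_deviate(0.50)  # diagonal cells hold the proportion .50
```

Here `x_hg` is the negative of `x_gh`, and the diagonal deviate is zero.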


DISCRIMINAL DISPERSIONS OF COMPOSITE STIMULI 


Since this problem is concerned with different dispersions of combinations 
of stimuli, it will be convenient to have a general formula for them. The 
equation of comparative judgment, Case II [2], takes the general form

$$S_g - S_h = x_{gh}\,\sigma_{gh}, \tag{1}$$

where $(S_g - S_h)$ is the scale difference between two sets of stimuli, $x_{gh}$ is the
normal deviate corresponding to $P_{gh}$, the proportion of subjects who prefer
stimulus set g to stimulus set h, and $\sigma_{gh}$ is the standard deviation of $(S_g - S_h)$.
Equation (1) is based upon an assumption that the reactions or “discriminal
processes” of subjects to stimuli may be quantified conceptually, ordered along
a particular psychological dimension, say, of preference, and assigned subjective
values on an equal unit scale. The distribution of subjective values associated
with any stimulus, g, is assumed normal in the subject population, with mean
(or modal discriminal process) $S_g$ and standard deviation (or discriminal dispersion)
$\sigma_g$. The assumptions underlying (1) are subject to empirical verification,
by utilizing the model to find estimates of the conceptual parameters,
by reproducing from those parameters the proportion of subjects in a given
sample who chose each stimulus in every stimulus pair, then by testing the
goodness of fit of reproduced to observed proportions.

The scale value, $S_i$, of any single stimulus of a stimulus set, may be considered
the mean subjective value assigned that stimulus by members of the
subject population. The difference between $S_i$ and the subjective value which a
single subject assigns to stimulus i will be called a discriminal deviation.

Let $y_{1g}, y_{2g}, \ldots, y_{ng}$ denote discriminal deviations for the objects in set g
and let $z_{1h}, z_{2h}, \ldots, z_{mh}$ denote the discriminal deviations for the objects in set h.
Then the variance of the difference between the subjective values of the two
sets will be

$$\sigma_{gh}^2 = \frac{1}{N}\sum\left[(y_{1g} + y_{2g} + \cdots + y_{ng}) - (z_{1h} + z_{2h} + \cdots + z_{mh})\right]^2, \tag{2}$$

where N is the number of subjects. Assuming all of the single objects to have the
same variance, we have

[Equation (3) is not recoverable here.]

and, assuming the same correlation between all pairs of single objects,

$$\sigma_{gh}^2 = n\sigma^2 + n(n-1)r\sigma^2 + m\sigma^2 + m(m-1)r\sigma^2 - 2nmr\sigma^2. \tag{4}$$


Let us define as the unit of measurement the common discriminal dispersion
of single stimuli,

$$\sigma_i = \sigma_k = 1. \tag{5}$$

Then,

$$\sigma_{gh}^2 = n + n(n-1)r + m + m(m-1)r - 2nmr, \tag{6}$$

which becomes

$$\sigma_{gh}^2 = (n + m) + r[n(n-1) + m(m-1) - 2nm]. \tag{7}$$
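
The step from (2) to (4) can be sketched as follows. This is an illustrative expansion under the stated assumptions of a common variance and a common correlation, not a transcription of the original intermediate equation; the symbol $r_{pq}$ for the correlation between the discriminal deviations of single objects p and q is introduced here for the purpose.

```latex
% Sketch only: r_{pq} is an assumed symbol for the correlation between the
% discriminal deviations of single objects p and q.
\begin{align*}
\sigma_{gh}^2
  &= n\sigma^2 + m\sigma^2
     + 2\sigma^2 \sum_{p<q \,\in\, g} r_{pq}
     + 2\sigma^2 \sum_{p<q \,\in\, h} r_{pq}
     - 2\sigma^2 \sum_{p \in g}\sum_{q \in h} r_{pq} \\
  &= n\sigma^2 + n(n-1)r\sigma^2 + m\sigma^2 + m(m-1)r\sigma^2 - 2nmr\sigma^2 .
\end{align*}
```

The second line follows because set g contains n(n − 1)/2 pairs of objects, set h contains m(m − 1)/2 pairs, and there are nm cross pairs between the two sets, each with the same correlation r.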


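Applying this estimation formula to the three cases of this study we have

1) when n=2 and m=2

$$\sigma_{jm}^2 = 4 - 4r, \qquad \sigma_{jm} = 2\sqrt{1-r} = u_2 \quad \text{(double-double)}; \tag{8}$$

2) when n=2 and m=1

$$\sigma_{jk}^2 = 3 - 2r, \qquad \sigma_{jk} = \sqrt{3-2r} = u_{12} \quad \text{(single-double)}; \tag{9}$$

3) when n=1 and m=1

$$\sigma_{ik}^2 = 2 - 2r, \qquad \sigma_{ik} = \sqrt{2}\sqrt{1-r} = u_1 \quad \text{(single-single)}. \tag{10}$$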


Equation (8) is applicable to section (jm) of Table 2 which shows the experi- 
mental data for the double-double comparisons. Equation (9) is applicable to 
section (jk) of the same table which shows the experimental data for the single-double
comparisons. Because of the symmetry of (7) in n and m the stretching
factor $u_{12} = u_{21}$. In other words, the dispersion is the same for n=1, m=2 as it
is for n=2, m=1, as was to be expected. Equation (10) applies to section (ik) 
of Table 2 which shows the experimental results for the single-single compari- 
sons. 
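
The three stretching factors follow from (7) by substitution. The sketch below evaluates that formula directly, with the single-stimulus dispersion taken as the unit; the function name is illustrative, and r = 0 is used as the default only for the sake of the example.

```python
import math


def dispersion(n, m, r=0.0):
    """Standard deviation of the difference between a set of n objects and a
    set of m objects, per equation (7), with unit single-stimulus dispersion
    and a common correlation r between all pairs of single objects."""
    variance = (n + m) + r * (n * (n - 1) + m * (m - 1) - 2 * n * m)
    return math.sqrt(variance)


u1 = dispersion(1, 1)   # single-single, equation (10)
u12 = dispersion(2, 1)  # single-double, equation (9)
u21 = dispersion(1, 2)  # same comparison with the roles of n and m exchanged
u2 = dispersion(2, 2)   # double-double, equation (8)
```

With r = 0 the factors are the square roots of 2, 3, and 4, and `u12` equals `u21` as the symmetry of (7) requires.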

From (1), (8), (9), and (10), we can write the equation of comparative 
judgment corresponding to each of the three sections of Table 2. Then 


$$S_i - S_k = x_{ik}u_1 \quad \text{(single-single comparison)}; \tag{11}$$

$$S_j - S_k = x_{jk}u_{12} \quad \text{(single-double comparison)}; \tag{12}$$

$$S_j - S_m = x_{jm}u_2 \quad \text{(double-double comparison)}. \tag{13}$$

It should be noted that the left member of each of these equations denotes the 
difference between two scale values and hence it is immaterial where the origin 
is located. It should also be noted that the scale values are assumed to be inde- 
pendent of their combinations in groups of two. The stretching factors wu, tw, 


and U2 denote the standard deviations in which the normal deviates z are 
expressed. 


RESULTS 


By operating appropriately on the sections of Table 2 according to equations 
(11), (12), and (13), we obtain the matrix of Table 3, where the entry in row g 
and column h is a single estimate of a stimulus difference, $S_g - S_h$. For
simplicity, the correlation coefficient r, in equations (8), (9), and (10), was assumed
to be zero. The matrix of Table 3 remains incomplete. 

A complete matrix of more stable estimates of stimulus differences is gener-
ated by computing

$$\overline{(S_g - S_h)} = \frac{1}{q}\sum_{q}\left[(S_g - S_q) - (S_h - S_q)\right], \tag{14}$$

defined only for the q columns of Table 3 in which entries appear both in row g
and row h; q may have different values for different combinations of g and h.
We define an arbitrary origin such that

$$\sum_{h=1}^{15} S_h = 0. \tag{15}$$

Then

$$\hat{S}_g = \frac{1}{15}\sum_{h=1}^{15}\overline{(S_g - S_h)} \tag{16}$$

yields an estimate of the scale value of each stimulus set, in terms of that origin.
The values $\hat{S}_g$ are simply the means of the rows of Table 4.

To locate a rational origin, we solve for the constant in the ten equations of
the form

$$(\hat{S}_{ab} - c) = (\hat{S}_a - c) + (\hat{S}_b - c), \tag{17}$$

$$\hat{S}_{ab} + c = \hat{S}_a + \hat{S}_b. \tag{18}$$

The ten estimates for c appear in Table 5. Their mean is -2.33, with a stand-
ard deviation of .23.

Subtracting the mean estimate of c, -2.33, from each value $\hat{S}_g$ of Table 4
yields the estimated absolute scale value, $\hat{S}_{ga}$. In Fig. 4, the 15 scale values are
plotted along a single dimension, with reference to the absolute origin. Finally,
Fig. 5 displays the plot of the ten relationships represented by equation (17)
where $\hat{S}_{ga} = \hat{S}_g - c$. For confirmation of the additive relationships among scale
values, and of the existence of a subjective origin, the ten points in Fig. 5 are 
expected to cluster about a line of unit slope. Considering the simplifying 
assumptions that have been made, the agreement is reasonably good. The 
additive character of subjective values is approximately confirmed. 
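
The arithmetic behind equations (17) and (18) can be sketched as follows. The scale values used here are hypothetical placeholders chosen for illustration, not the entries of Table 4, and the function names are ours.

```python
def additive_constant_estimates(single, pair):
    """One estimate of c per pair, from equation (18): c = S_a + S_b - S_ab."""
    return {ab: single[ab[0]] + single[ab[1]] - s_ab for ab, s_ab in pair.items()}


def absolute_values(scale_values, c):
    """Shift scale values from the arbitrary origin to the rational origin."""
    return {name: s - c for name, s in scale_values.items()}


# Hypothetical scale values relative to the arbitrary origin (not from Table 4).
single = {"A": -1.0, "B": -1.5}
pair = {("A", "B"): -0.2}

estimates = additive_constant_estimates(single, pair)
c_mean = sum(estimates.values()) / len(estimates)
absolute_single = absolute_values(single, c_mean)
```

In the study there are ten such pairs and therefore ten estimates of c, whose mean fixes the rational origin; the sketch uses one pair only to keep the arithmetic visible.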

Of incidental interest are the relative sizes of absolute scale values, $\hat{S}_{ga}$, for
the five single objects. The most preferred object, the record player (C), has a 
subjective value more than four times as great as the least preferred object, the 
unabridged dictionary (B). From the scale values, we would predict that a 
typical subject would prefer the record player to the dictionary, the brief case, 
and the desk lamp. 

Somewhat surprising is the lack of any systematic relationship between pref-
erence for and cost of the stimulus objects. The prices of the five single objects,
taken from the retail sales catalogue of 1951 from which the items were se-
lected, are: A) brief case, $9.95; B) dictionary, $30.00; C) record player, $29.95;
D) desk lamp, $7.95; and E) pen and pencil set, $19.75. Comparison of these
prices with the estimates of their absolute subjective values, $\hat{S}_{ga}$, Table 4,
demonstrates a negligible relationship. The two most costly items, B) diction-
ary and C) record player, are lowest and highest, respectively, with absolute
subjective values of .61 and 2.81. Apparently subjects were not guided in their
choices, either explicitly or implicitly, by the monetary values of the
items.

Fig. 4. Values of $\hat{S}_{ga}$ plotted with reference to rational origin.


TABLE 5
ESTIMATES OF THE ADDITIVE CONSTANT

Stimulus Combination    c

                        -2.
                        -2.
                        -2.
                        -2.
                        -2.
                        -2.
                        -2.
                        -2.
                        -1.
                        -2.
Fig. 5. A check on the additive assumption.


IMPLICATIONS 


This study was designed primarily to test a method of locating a rational 
origin for a subjective preference scale. In a previous study by Horst a method 
was found for locating the zero point by using desirable as well as undesirable 
stimuli, objects, or events. In this study we have found a method of locating 
the zero point in the subjective preference scale by using only objects that are 
desirable so that all of them have scale values above the rational zero point. 
In doing so we have postulated that subjective values can be additive. The 
subjective value of a combination of two objects has been assumed to be very 
closely approximated by the sum of the subjective values of the two objects 
considered singly. We are not assuming that this linearity can be obtained when 
the composite contains many objects. Furthermore we recognize that the de- 
sirability of a pair of objects is not always the sum of their single desirabilities. 
A pair of shoes is more than twice as desirable as a right shoe when the left one 
has been lost. We are assuming that the objects are not dependent in their 
function or desirability and that one does not substitute for the other. This is, 
of course, an old and well-known problem. 

The problem of locating an origin on the subjective preference continuum 
may be regarded as of only theoretical interest but such a judgment is probably 
in error. There are many interesting aspects of subjective measurement with a 
rational origin and we shall indicate a few implications for the social sciences. 
The additive character of subjective values has been indicated. By locating a 
rational origin one legitimately can say that one subjective value is, say, twice 
that of another. Such comparisons are not possible without a rational origin. 

The indifference curves of economic theory are ordinarily regarded as con-
tour lines on a topographic map. The inside contours are regarded as the
differences in elevation. Not only can the increments in utility be measured, but
it becomes possible to determine the elevation of utility from a rational origin. 
Combinations of utilities thus become amenable to study, and it should also 
be possible to study profitably the relations between utility and price. Taking 
into account the recognized importance of differing discriminal dispersions of 
stimuli upon prediction of choice [4], such developments should be of con- 
siderable value in market research and consumer preference studies. 

In principle it is possible to obtain the appraisal of an experimental popula- 
tion on a group of stimuli, taken separately, and to predict the proportion of 
that population that would vote for each of several groups of stimuli. The 
stimuli could be political ideas that could be combined into competing political 
programs. A survey of all the separate items could enable us to predict the 
proportion of the population that would vote for one combination rather than 
some other combination of items. The combinations could be studied in order 
to find those that are more acceptable than other combinations. The same type 
of reasoning applies to the study of various social attitudes. 

In psychological studies it is of considerable interest to be able to locate a 
rational origin for the affective continuum of acceptance-rejection or like- 
dislike. The relations between subjective values as determined from a rational 
origin and the discriminal dispersions for the prediction of choice should be a 
fruitful field for further research. 
APPLICATIONS OF A NEW GRAPHIC METHOD
IN STATISTICAL MEASUREMENT


JACOB MINCER
City College of New York


INTRODUCTION 


In a recent address delivered to the Royal Statistical Society E. S. Pearson
made a strong plea for a renewal of interest in the geometrical approach in 
the presentation and analysis of statistical problems. In Pearson’s words: 
. .. appropriate methods of visual presentation can play an important part in helping 
the statistician in ways such as these: in understanding the meaning of his arithmeti- 
cal results; in avoiding mistakes through lack of fit of his models; in saving time; 
and in providing what is often the best means of making clear his methods of analysis 
to the non-statistician.¹


By virtue of simplicity and generality of the principle it supplies, a note by 
S. I. Askovitz which appeared in Science early in 1955 promises to become a
new starting point toward the development of a “geometry of statistics.”² While
ad hoc graphic procedures have been used in statistics all along, they involve
either theoretically crude approximations, “free-hand drawing,” or specially 
prepared scales (nomographs). The new method requires no special scales and 
leaves no room for “free-hand.” The only practical limitation on its theoretical 
precision is the sharpness of eye and pencil. 

In his note, Askovitz presents a method for determining the mean value of n 
observations. While the need for labor saving or for visual demonstration is not 
particularly great in the computation of a simple arithmetic average, several 
applications to the calculation of other measures, most of them of importance 
in economic statistics, will illustrate the fruitfulness of the method more strik- 
ingly. 

We shall start with an exposition of the Askovitz method and show how it 
can be applied to such diverse problems as calculation of average deviations, 
geometric means, factorials, means of frequency distributions, Gini concentra- 
tion ratios, moving averages, and seasonal adjustments. 


THE ASKOVITZ METHOD OF AVERAGING


To find the average of n values (say A, B, C, D, E in Fig. 1) proceed as fol- 
lows: Place the observations at horizontally equal distances. Let one half of 
this distance constitute the X-unit of scale. Starting from A move along seg- 
ment AB one horizontal scale unit to the auxiliary point b. Next move another 
X-unit from b along bC to point c. Continue on segment cD to get point d, and 





1 Pearson, E. S., “Some aspects of the geometry of statistics,” Journal of the Royal Statistical Society, (A), Part
II, (1956), 125-46.

2 Askovitz, S. I., “Rapid method for determining mean values and areas graphically,” Science, Vol. 121, No.
3137, Feb. 11, 1955, pp. 212-3. Another important contribution by the same author with a potentially wide range
of applicability was recently published in this Journal: S. I. Askovitz, “A short-cut graphic method for fitting the
best straight line to a series of points according to the criterion of least squares,” Journal of the American Statistical
Association, (1957), 13-7.
finally on dE to get e. If more observations are to be included in the averaging
simply continue the process until the last point is reached. The ordinate of the 
last point so obtained is the mean of the ordinates of the original observations. 

Fig. 1. Finding the mean value. (Vertical axis: value of observation; horizontal axis: scale units.)

Since the ordinate of the (n−1)th auxiliary point is the mean of the ordinates
of n original observations, the procedure outlined amounts to a successive
averaging process in which (n−1) observations are averaged to obtain a mean,
say M(n−1), which is in turn averaged with one additional observation y to
yield an average of n observations, M(n). It is necessary, however, that the
following relation hold:

$$M(n) = \frac{n-1}{n}M(n-1) + \frac{1}{n}y,$$

M(n) being a weighted average, with weights in proportion 1:(n−1).³ That this
condition is fulfilled by the geometric construction is easily seen: Let the
abscissa of A be zero. Then the x-coordinates of points B, C, D, E are exactly
double those of b, c, d, e respectively, and the horizontal projections of bB,
cC, dD, eE are exactly equal to the x-coordinates of b, c, d, e, hence to the
sequence of integers 1, 2, 3, .... Clearly, points b, c, d, e divide the
corresponding segments AB, bC, cD, dE in the required proportions 1:(n−1),
where (n−1) takes on successive values 1, 2, 3, ..., so as to constitute
successive averages of the corresponding sets (A, B), (A, B, C), (A, B, C, D),
and (A, B, C, D, E).
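
The construction can be imitated numerically. The sketch below is only an illustration of the geometry described above, with a function name of our own choosing: observations are placed two X-units apart, and each auxiliary point is reached by moving one X-unit along the segment toward the next observation.

```python
def askovitz_trace(values):
    """Return the auxiliary points (x, y) of the graphic averaging construction.

    Observations are placed at x = 0, 2, 4, ... so that one X-unit is half the
    spacing. Each step moves one X-unit from the current point along the
    straight segment joining it to the next observation.
    """
    x, y = 0.0, float(values[0])
    trace = []
    for i, target in enumerate(values[1:], start=1):
        target_x = 2.0 * i
        slope = (target - y) / (target_x - x)
        x, y = x + 1.0, y + slope  # one X-unit along the segment
        trace.append((x, y))
    return trace


observations = [3, 7, 5, 9, 1]           # points A, B, C, D, E (illustrative)
points = askovitz_trace(observations)    # auxiliary points b, c, d, e
graphic_mean = points[-1][1]
```

Each auxiliary ordinate equals the running mean of the observations included so far, so the last ordinate agrees with the ordinary arithmetic mean.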





3 The relation above may be rewritten to read that the average of n observations may be found by taking the
average of n−1 observations and adding one nth of the difference between the nth observation and the mean of the
n−1 observations. See W. Allen Wallis and Harry V. Roberts, Statistics: A New Approach, The Free Press, Glencoe,
Ill., 1956, p. 225. This is perhaps the simplest way of looking at the construction in Fig. 1.
APPLICATIONS

(a) Average deviations 

After the mean of n observations is found, shift the X-axis to pass through
the mean (e.g. point e in Fig. 1). Next transfer all original points below the
new abscissa to the same distance above it, so that all n points now have posi-
tive ordinates with respect to the new axis. A repetition of the averaging pro-
cedure on these points yields the average deviation of the original set of ob-
servations.⁴


(b) Geometric means and factorials 


It is obvious that the simple averaging procedure when carried out on semi-
logarithmic paper (logarithmic Y-scale) yields geometric means. A special case
of a geometric average is the expression $\sqrt[n]{n!}$. Computation of the factorials
merely requires raising of this expression to the nth power (or multiplication
by n on the log. scale). Similarly, the calculation of combinatorial coefficients
is facilitated: Let the symbol $C_r^n$ denote the number of combinations of r out of
n. $C_r^n = n(n-1)(n-2)\cdots(n-r+1)/r(r-1)(r-2)\cdots 1$. Hence $\sqrt[r]{C_r^n}$
can be graphically obtained on semi-log. paper as the vertical distance between
two means, one of the factors in the numerator and the other of the factors in
the denominator.
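
A short numerical counterpart may make the point concrete. In this sketch (function names are ours) a vertical distance on the log scale corresponds to a ratio of geometric means, and the rth power of that ratio recovers the combinatorial coefficient.

```python
import math


def geometric_mean(factors):
    """Geometric mean, computed as the antilog of the mean logarithm."""
    factors = list(factors)
    return math.exp(sum(math.log(f) for f in factors) / len(factors))


def rth_root_of_combinations(n, r):
    """Ratio of the geometric means of the r numerator and r denominator factors."""
    numerator = range(n - r + 1, n + 1)   # n, n-1, ..., n-r+1
    denominator = range(1, r + 1)         # r, r-1, ..., 1
    return geometric_mean(numerator) / geometric_mean(denominator)


combinations_5_2 = rth_root_of_combinations(5, 2) ** 2
factorial_6 = geometric_mean(range(1, 7)) ** 6
```

Raising the geometric mean of 1, 2, ..., n to the nth power gives n! in the same way.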


(c) Means of frequency distributions 

The Askovitz method of determining mean values is tantamount to approxi- 
mate integration. In order to estimate the area under a curve it is merely 
necessary to mark off, along the curve, points equally spaced horizontally, and 
apply the method to these selected points to obtain their mean ordinate. The 
product of this mean y-value and the range of x-values is the estimate of the
area. The accuracy of the result should, of course, increase as the subdivisions 
are made finer. 

Thus a graph of a cumulative distribution (a “less than” ogive) can be used 
to estimate the mean of the distribution in the following fashion: 

The total population (N) is divided into several equally numerous groups 
(quintiles are used in Fig. 2, but deciles would be better). This provides for 
equal spacing on the Y-axis. The approximate average X-value of each group 
is taken as usual at midpoints of the Y-intervals (points 1, 2, 3, 4, 5). The 
simple average of these X-values arrived at by the Askovitz method is the 
approximate mean of the distribution (OM in Fig. 2). Note, however, that in 
this case the averaging is carried out on the abscissas, and not on the ordinates 
as in the previous examples. With this proviso, auxiliary points b, c, d, e serve 
the same purpose as in Fig. 1. Naturally, the approximation improves as the 
number of quantiles is increased. 
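
The quantile-midpoint approximation can be tried numerically. The sketch below assumes the ogive is available as an inverse cumulative distribution function; the function name and the two test distributions are our own illustrations, not taken from the paper.

```python
import math


def quantile_midpoint_mean(inverse_cdf, groups=5):
    """Approximate a mean by averaging X-values at the midpoints of equal groups.

    inverse_cdf(p) returns the X-value below which a fraction p of the
    population lies, which is what one reads off a "less than" ogive.
    """
    midpoints = [(g + 0.5) / groups for g in range(groups)]
    return sum(inverse_cdf(p) for p in midpoints) / groups


def exponential_inverse_cdf(p):
    """Inverse ogive of an exponential distribution whose true mean is 1."""
    return -math.log(1.0 - p)


mean_quintiles = quantile_midpoint_mean(exponential_inverse_cdf, groups=5)
mean_deciles = quantile_midpoint_mean(exponential_inverse_cdf, groups=10)
uniform_mean = quantile_midpoint_mean(lambda p: p, groups=5)
```

For this skewed example the decile estimate lies closer to the true mean than the quintile estimate, in line with the remark that the approximation improves as the number of quantiles is increased.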

Since $X_3$ in Fig. 2 is by construction the median of the distribution, the graph
also enables us to evaluate the skewness in it in terms of the difference between
the mean and the median. Furthermore, when the X-scale is logarithmic, the 
same procedure yields a geometric mean of the frequency distribution. 





4 A similar and in some ways more elegant procedure was used by Askovitz in finding the mean of the absolute
values of the vertical deviations of the original points from the least squares line in op. cit., Journal of the American
Statistical Association, p. 17.
Fig. 2. Finding the mean of a frequency distribution. (Axis label: population quantiles.)


(d) Lorenz curves and Gini concentration ratios 


A Lorenz curve can be graphically constructed from a cumulative distribu-
tion graph. The successive ordinates of the Lorenz curve of the distribution in
Fig. 2 corresponding to the successive 20%, 40%, 60%, 80%, and 100% of the
population must be proportional to $\sum_{j=1}^{i} X_j$ (i = 1, 2, ..., 5 in Fig. 2).⁵ The
sum $\sum_{j=1}^{5} X_j$ can therefore be given the value 1 and used as height of the
square in Fig. 3.

Whether or not constructed from the original frequency distribution, given
a Lorenz curve the Gini concentration ratio is readily determined graphically:

The concentration ratio R is defined as the ratio of the area between the
Lorenz curve and the diagonal AC (Fig. 3) to the area of the triangle ABC.
Since AB=1, the area under the curve is found by averaging the ordinates of
the midpoints of quantiles (points 1, 2, 3, 4, 5 on the Lorenz curve). On Fig. 3
the average is graphically found to be Oe=BN. Since the concentration ratio
R = (½ − BN)/½ = 1 − 2BN, it is sufficient to mark off NK=BN to find R=CK.

5 For example, if Fig. 2 were a distribution of income, the fraction of total income received by the lower 40%
of income recipients would be

$$\frac{.2(X_1 + X_2)}{M} = \frac{X_1 + X_2}{X_1 + X_2 + X_3 + X_4 + X_5}.$$

Fig. 3. Finding a Gini concentration ratio from a Lorenz curve.
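
The relation R = 1 − 2BN can be checked with a small calculation. This sketch assumes, as a simplification not made in the paper, that the Lorenz curve is a broken line through the cumulative shares, so that each midpoint ordinate is the average of the two adjacent cumulative shares. The function name is ours.

```python
def gini_from_group_values(group_values):
    """Concentration ratio from the representative X-values of equal-sized groups.

    The mean midpoint ordinate of the (piecewise linear) Lorenz curve plays
    the role of BN, and the ratio is then R = 1 - 2 * BN.
    """
    ordered = sorted(group_values)
    total = sum(ordered)
    cumulative = 0.0
    midpoint_ordinates = []
    for x in ordered:
        lower = cumulative / total
        cumulative += x
        upper = cumulative / total
        midpoint_ordinates.append((lower + upper) / 2)
    bn = sum(midpoint_ordinates) / len(ordered)
    return 1 - 2 * bn


equal_shares = gini_from_group_values([1, 1, 1, 1, 1])
one_group_has_all = gini_from_group_values([0, 0, 0, 0, 1])
```

Equal group values give a ratio of zero, and the ratio rises as the total is concentrated in fewer groups.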


(e) Moving averages⁶


An average does not depend on the order in which observations are averaged. 
Hence in Fig. 1 the same result should obtain whether the averaging of (A, B, 
C, D, E) proceeds “from left to right” or “from right to left.” 

In Fig. 4 the averaging is performed from both directions so that auxiliary
points d’, c’, b’, a’ correspond to b, c, d, e as successive averages, and, in fact,
a’ = e. Thus d is an average of 4 successive points (A, B, C, D) and the mean of
all 5 points is on the segment dE. Similarly b’ is an average of (B, C, D, E) and
the same mean of the 5 points is also on the segment b’A. This observation
enables us to find b’ without working “from right to left” since it is found merely
by extending the segment Ae one X-unit to the right.

6 The procedure was developed independently by Askovitz (personal communication).

Fig. 4. Constructing moving averages.

The reader will have noticed by now that the reaching of b’ from d in our 
example was a process of tracing 4-point moving averages. We can, therefore, 
summarize the procedure of tracing (n—1)-point moving averages as follows: 
Find the average of the first (n—1) points, say at d (Fig. 4). Connect d with 
the nth point (E) to obtain the average of the first n points at e. Connect e
with the point to be dropped (A) and extend the line one unit to the right to
obtain b’, the moving average of (B, C, D, E). To get the moving average of
(C, D, E, F) proceed the same way, that is join b’ with F to obtain f, the average 
of (B, C, D, E, F), next connect f with the point to be dropped (B), extend 
Bf to obtain g, the moving average of (C, D, E, F). 

The remarkable feature of this procedure is that it simultaneously traces 
(n—1)-point as well as n-point averages (d, b’, g are 4-point, and e, f are 5-point 
moving averages in our example). 
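
In arithmetic terms the two graphic steps are a pair of update rules, sketched below with names of our own choosing. Joining the current (n−1)-point average to the entering point gives the n-point average; extending the line from the dropped point through that average by one unit gives the next (n−1)-point average.

```python
def trace_moving_averages(values, window):
    """Trace window-point and (window + 1)-point moving averages together.

    window corresponds to (n - 1) in the text. Returns two lists: the
    window-point averages (d, b', g, ...) and the (window + 1)-point
    averages (e, f, ...).
    """
    n = window + 1
    short = sum(values[:window]) / window       # point d
    shorts, longs = [short], []
    for i in range(window, len(values)):
        entering, leaving = values[i], values[i - window]
        long_avg = short + (entering - short) / n        # join d to E, giving e
        short = long_avg + (long_avg - leaving) / window  # extend A-e one unit, giving b'
        longs.append(long_avg)
        shorts.append(short)
    return shorts, longs


series = [2, 4, 6, 8, 10, 12]   # illustrative points A through F
four_point, five_point = trace_moving_averages(series, window=4)
```

The two lists agree with averages computed directly from the definition, which is the arithmetic content of the geometric argument above.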


(f) Seasonal indexes and adjustment of time series 


The method for tracing moving averages as outlined in the preceding section 
leads to a very efficient graphic procedure for extracting seasonal indexes and 
performing seasonal adjustments on time series. The procedure follows the 
standard “ratio-to-moving average” method of seasonal analysis, with two ex- 
ceptions: (1) geometric, rather than arithmetic averages are used to smooth 
the original series, (2) geometric averages of ratios-to-moving average are 
utilized to obtain preliminary seasonal indexes. These exceptions are introduced
for the sake of convenience.⁷ In some cases advantages may be claimed for
these exceptions also on theoretical grounds.⁸
The step-by-step procedure in a seasonal analysis of an n-years series of
monthly data follows:
1. Plot the original observations on semi-logarithmic paper.
2. Trace 12-month moving averages as described in the preceding section.
3. Separately, for each January, February, etc., mark off the differences be-
   tween the original observations and the corresponding moving average,⁹
   and plot these distances above (and below) a baseline = 100 conveniently
   located on the lower part of the chart. Thus for each month we have a
   contiguous set of n−1 points (ratios-to-moving average) equally spaced
   horizontally.
4. Graphically average the n−1 values for each month to obtain 12 pre-
   liminary seasonal indexes.
5. Adjust the 12 indexes to average 100 per cent: Get the sum¹⁰ (Σ) of the
   12 preliminary indexes. The correction factor 1200/Σ is then read off the
   log scale as a difference, which is then applied to lift or lower the whole
   set of preliminary indexes.
6. To deseasonalize the original series graphically subtract the final indexes
   obtained in step 5 from the respective months.
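
The six steps above can be sketched numerically. This is a minimal illustration under stated assumptions, not the graphic procedure itself: logarithms stand in for the semi-logarithmic chart, the 12-month averages are centered by averaging adjacent pairs, the series is assumed to start in January and to cover whole years, and all names are ours.

```python
import math


def seasonal_indexes(series):
    """Seasonal indexes (averaging 100) by a geometric ratio-to-moving-average."""
    logs = [math.log(v) for v in series]                                  # step 1
    ma12 = [sum(logs[i:i + 12]) / 12 for i in range(len(logs) - 11)]      # step 2
    # Averaging adjacent 12-month averages centers the result on month i + 6.
    centered = {i + 6: (ma12[i] + ma12[i + 1]) / 2 for i in range(len(ma12) - 1)}
    by_month = [[] for _ in range(12)]
    for t, level in centered.items():                                     # step 3
        by_month[t % 12].append(logs[t] - level)
    prelim = [100 * math.exp(sum(d) / len(d)) for d in by_month]          # step 4
    factor = 1200 / sum(prelim)                                           # step 5
    return [p * factor for p in prelim]


def deseasonalize(series, indexes):
    """Step 6: divide each month by its index (a subtraction on the log scale)."""
    return [v / (indexes[t % 12] / 100) for t, v in enumerate(series)]


# Synthetic three-year series: a 1 per cent monthly trend times a fixed pattern.
pattern = [0.8, 0.9, 1.0, 1.1, 1.2, 1.0, 0.9, 1.0, 1.1, 1.2, 0.9, 0.9]
raw = [100 * 1.01 ** t * pattern[t % 12] for t in range(36)]
indexes = seasonal_indexes(raw)
adjusted = deseasonalize(raw, indexes)
```

On this synthetic series the recovered indexes match the pattern that was built in, and the adjusted series follows the smooth trend.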
The comparative efficiency of the graphic seasonal analysis was tested on a 
textbook example. Using the standard procedure and a Monroe calculator the 
operation, including plotting of original and adjusted data (a time series of 8 


years, monthly) took over 4 hours. The same job was done graphically in less 
than 2 hours. The maximum difference in numerical results was about 2% 
which may be partly an error due to imprecision in graphing and partly no 
error at all, but a difference due to differences in method. 





7 If it is desired to follow the standard procedure without exception, the 12-point smoothing operation can be
performed on an arithmetic chart and the graph then replotted on a semi-log. chart.

8 For an argument in favor of “logarithmic seasonal indexes” when components of time series are assumed
to combine in a multiplicative fashion, see Cowden, D. J., “Moving seasonal indexes,” Journal of the American
Statistical Association, 37 (1942), 523-4.

9 When the moving averages are connected by a line graph they become properly centered.

10 This is the only step at which a negligible amount of arithmetic is indicated.








GRAPHIC COMPUTATION OF $R_{1.23}$
Lord¹ recently published a nomograph “for calculating a multiple
correlation coefficient ($R_{1.23}$) from the zero-order correlations ($r_{12}$, $r_{13}$,
and $r_{23}$).” His paper reminded us that in 1947 we developed a graphic
method of computing $R_{1.23}$. If the sole object is speed of computation,
the advantage would doubtless lie with Lord’s nomograph rather than 
with our graphic method. But we might claim two possible advantages 
to the method outlined in this paper. First, the computation can be 
carried out on ordinary graph paper. Second, we believe that our graphic 
method can help many students to understand the basic relationships 
between the zero-order correlations and the multiple-correlation co- 
efficient. 


THE METHOD 


To explain our graphic process, we shall present two diagrams, based upon
two of Lord’s examples (op. cit. p. 1074). In each diagram, we start by
measuring $r_{12}$ and $r_{23}$ along the x-axis, and $r_{13}$ along the y-axis. If any of these
values are negative, we follow the usual conventions. Thus, in example 1, $r_{13}$ is
positive, so we measure it upward along the y-axis. But, in example 2, $r_{13}$ is
negative, so we measure it downward along the y-axis.

Next we erect a line perpendicular to the x-axis at the point, $r_{23}$. Also, we
draw an arc of a circle with a radius of 1.0. The distance along the perpendicular
from the x-axis to the arc is designated as b. From the intersection of the per-
pendicular and the arc, we draw a line to the origin of the diagram. Along this
line we measure $r_{13}$. This can be done conveniently by drawing an arc with $r_{13}$
as a radius, as shown in the diagrams. Finally, we draw a line from this point
to $r_{12}$ on the x-axis. This is the line marked a in the diagrams.

The multiple-correlation coefficient is given by 


$$R_{1.23} = a/b, \tag{1}$$


as we shall soon prove. We can measure the distances, a and b, on the diagram 
and compute the ratio. But we can also use graphic division, as shown by the 
dotted lines. First, draw the horizontal dotted line touching the top of line b. 
Along this line, measure the distance a. (A compass can be used here.) From 
this point on the horizontal dotted line, draw another dotted line through the 
origin of the diagram. This line is extended to cut a horizontal line representing 
y=1.0. The intersection is a/b units to the right of the y-axis. Since $a/b = R_{1.23}$,
this gives us the measure we want.





1 Frederick M. Lord, “Nomograph for Computing Multiple Correlation Coefficients,” Journal of the American 
Statistical Association, 50 (1955), 1073-77. 
PROOF


We seek to measure:

$$R_{1.23} = \frac{\sqrt{r_{12}^2 + r_{13}^2 - 2r_{12}r_{13}r_{23}}}{\sqrt{1 - r_{23}^2}}. \tag{2}$$

The length of line b in our diagrams is obviously

$$b = \sqrt{1 - r_{23}^2}$$

by the Pythagorean theorem. It is one leg of a right triangle. The other leg is
$r_{23}$, and the hypotenuse is 1.0.

Perhaps it is not so clear that the length of a measures the numerator of (2).
But if x and y are two sides of any triangle, and if θ is the included angle, the
third side of the triangle is

$$z = \sqrt{x^2 + y^2 - 2xy\cos\theta}, \tag{3}$$

by a well-known principle of trigonometry.

In our diagrams $x = r_{12}$ and $y = r_{13}$. Moreover, $\cos\theta = r_{23}$, since $r_{23}$ is the adjoin-
ing side of a right triangle whose hypotenuse is the radius of 1.0. By definition, the cosine of θ is
$r_{23}/1.0 = r_{23}$.

This completes the proof that the ratio a/b in our diagram measures $R_{1.23}$.
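
The construction can be followed in coordinates as a check. The sketch below places the point at distance $r_{13}$ along the ray through the origin whose angle has cosine $r_{23}$, measures a and b directly, and forms their ratio. The function name is ours; the two sets of correlations are those quoted for the two diagrams.

```python
import math


def graphic_multiple_correlation(r12, r13, r23):
    """Ratio a/b from the geometric construction, computed in coordinates."""
    theta = math.acos(r23)                    # angle whose cosine is r23
    # Point at (signed) distance r13 from the origin along the ray at angle theta.
    px, py = r13 * math.cos(theta), r13 * math.sin(theta)
    a = math.hypot(px - r12, py)              # distance to the point r12 on the x-axis
    b = math.sqrt(1 - r23 ** 2)               # height of the perpendicular at r23
    return a / b


example_1 = graphic_multiple_correlation(0.52, 0.22, 0.40)
example_2 = graphic_multiple_correlation(0.48, -0.22, 0.18)
```

A negative $r_{13}$ simply places the point on the opposite side of the origin, which matches the convention of measuring negative values downward before swinging the arc.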

What about the graphic division of a by b, as shown by our dotted lines? 
The dotted lines, together with a segment of the y-axis, form two similar right 
triangles in each diagram. Consider the two legs of each triangle. The legs of 
the smaller triangle are a and b. Suppose we call the corresponding legs of the 
other triangle p and 1.0. Then, because the triangles are similar, 


a/b = p/1.0 = p, 
so the distance, p, measures the ratio, $a/b = R_{1.23}$.





FINAL COMMENTS 


Our diagrams can be drawn quickly on ordinary graph paper. Thus, the 
statistician does not have to keep on hand a specially prepared nomograph. 
But if he should need to compute large numbers of multiple-correlation co- 
efficients, he might perhaps find it useful to prepare special paper. In addition 
to the usual rectilinear scales for x and y, he would draw a family of concentric
circles centering on the origin of the diagram, with radii varying from 0.0 to 1.0.
The only advantage of this special paper would be that $r_{13}$ could be plotted
directly, without using a compass. This, too, would be a nomograph, although 
it would be considerably different from Lord’s. 

We confess some doubt concerning the practical value of any sort of graphic 
computation of $R_{1.23}$. Numerical computation takes little time on a modern
calculating machine. Also, the statistician usually is interested not only in
$R_{1.23}$, but also in regression coefficients and their standard errors. These can all
be computed quickly on the calculating machine, together with the value of
$R_{1.23}$.

The main value of our graphic analysis probably is in helping students under-
stand the relationship between $R_{1.23}$ and the zero-order correlations.
[Figure: Graphic determination of multiple correlation coefficient. First diagram: example in which $r_{12}$ = 0.52, $r_{13}$ = 0.22, $r_{23}$ = 0.40 (computed $R_{1.23}$ = 0.52). Second diagram: example in which $r_{12}$ = 0.48, $r_{13}$ = -0.22, $r_{23}$ = 0.18 (computed $R_{1.23}$ = 0.57). Axes X and Y, each scaled to 1.0.]
CONFIDENCE INTERVALS FOR THE PRODUCT
OF TWO BINOMIAL PARAMETERS*


ROBERT J. BUEHLER
Iowa State College†


Many more-or-less reasonable solutions to the problem of statistical 
estimation of the product $P_1P_2$ of two binomial parameters by con-
fidence intervals can be given. After specializing to the case where
$P_1$ and $P_2$ are much less than 1 and to estimation by “one-sided” in-
tervals it is shown that a unique solution is obtained when one assumes
a certain set of inequalities and then requires the intervals to be as short 
as possible. Tables of intervals for 90 and 95 per cent confidence levels 
are presented based on reasonable sets of inequalities and on a Poisson 
approximation to the binomial. 


1. INTRODUCTION 


A general problem of practical importance is the following: Suppose a com-
plex mechanism of some sort (electronic apparatus, aircraft, etc.) is built
up from several components. It will be known which combinations of these 
components must function successfully for successful operation of the entire 
unit (some components may serve parallel functions so that a single failure 
will not be disastrous). It is assumed that for each component data are avail- 
able for estimating the probability of failure. From such data, what statistical 
statements can be made for the entire unit? For general planning purposes a 
point estimate of the probability of failure may not be sufficient; some sort of 
“confidence” statement may be desirable. 

This problem is investigated here in terms of confidence intervals in the sense 
of Neyman [6]. The analysis is specialized to small probabilities of failure and 
moderate sample sizes since these are believed to be of the greatest practical 
interest. The results given here actually represent only the first step toward an 
answer to the general problem since the numerical work has been carried out 
only for a system of two elements which are independent in the statistical sense 
and which serve parallel functions (either must succeed). The general case in- 
volves the estimation of sums of products of binomial parameters; further 
complications would arise if one did not assume statistical independence. 

Tables for obtaining upper confidence limits for a single binomial parameter 
and for the product of two binomial parameters are given in Section 2, with 
examples of their use. The theoretical basis for the tables is described in Sec- 
tions 3 and 4. Section 5 discusses very briefly the use of an auxiliary random 
variable to obtain shorter intervals. In Section 6 the calculation of the tabu- 
lated values is described. Section 7 shows how shortest one-sided intervals sub- 
ject to a system of inequalities may be defined for an arbitrary discrete dis- 
tribution. 





* I wish to thank Mr. J. M. Wiesen for suggesting this problem.
† This work was started at Sandia Corporation and was completed at the University of Wisconsin Naval
Research Laboratory with the assistance of Contract AT(11-1)-298 of the U.S. Atomic Energy Commission.
2. TABLES FOR ESTIMATING $P$ AND $P_1P_2$


Suppose that an effectively infinite population has been sampled and that k 
failures are observed in a sample of n. The quantity to be estimated is P, the 
fraction defective in the population. Estimation by interval may be accom-
plished by finding a set of numbers $c_n(k; \alpha)$, (k = 0, 1, ..., n), satisfying

$$\text{Prob}\{0 \le P \le c_n(k; \alpha)\} \ge \alpha \quad \text{for all } 0 \le P \le 1. \tag{1}$$

Here α is the confidence coefficient, 0<α<1. In choosing zero as the left end-
point of the interval we restrict ourselves to “one-sided” intervals in keeping
with our special interest in cases in which P≪1. The numbers $c_n(k; \alpha)$ are called
a system of “upper confidence limits” for estimating P. When such a system of
numbers is used consistently, it is assured that in the long run the inequality
$P \le c_n(k; \alpha)$ will be satisfied at least 100α per cent of the time whatever the
true values of P may be.

The conventional solution for $c_n(k; \alpha)$ is discussed in Section 3. By making a
Poisson approximation to the binomial it can be shown that for large $n$, the
conventional upper confidence limits vary inversely with $n$ for fixed $k$; that is,
$nc_n(k; \alpha)$ approaches a limiting value as $n$ increases. Such limiting values are
given in Table 1; to find the upper confidence limit one simply divides the
tabulated value by $n$. A few examples of the accuracy of the approximation
are given in Table 2. Since the approximate values are larger than the true
values the validity of the inequality (1) is further ensured when the
approximation is used.
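The limiting values $nc_n(k; \alpha)$ and the exact limits can both be obtained numerically. The sketch below is an illustration only, not the procedure used to prepare the tables: it assumes that the exact limit solves the binomial equation (3) of Section 3 and that the limiting value solves the Poisson equation (6), and it uses plain bisection. The function names are hypothetical.

```python
import math


def binomial_cdf(k, n, p):
    """Probability of k or fewer failures in n trials with failure rate p."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def poisson_cdf(k, lam):
    """Probability of k or fewer events for a Poisson mean lam."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def solve_decreasing(f, target, lo, hi, iterations=100):
    """Bisection for f(x) = target, where f decreases and f(lo) > target >= f(hi)."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_upper_limit(k, n, alpha):
    """Conventional c_n(k; alpha): the binomial sum in (3) equals 1 - alpha."""
    if k >= n:
        return 1.0
    return solve_decreasing(lambda c: binomial_cdf(k, n, c), 1 - alpha, 0.0, 1.0)


def poisson_limit(k, alpha):
    """Limiting value of n * c_n(k; alpha): the Poisson sum in (6) equals 1 - alpha."""
    hi = 1.0
    while poisson_cdf(k, hi) > 1 - alpha:
        hi *= 2
    return solve_decreasing(lambda lam: poisson_cdf(k, lam), 1 - alpha, 0.0, hi)


n, alpha = 100, 0.9
for k in (0, 1, 2):
    exact = exact_upper_limit(k, n, alpha)
    approx = poisson_limit(k, alpha) / n
    print(f"k={k}: exact={exact:.5f}  Poisson approximation={approx:.5f}")
```

For $k = 0$ both equations have closed forms, $c_n(0; \alpha) = 1 - (1 - \alpha)^{1/n}$ and $\lambda(0; \alpha) = -\ln(1 - \alpha)$, which gives a quick check on the bisection.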

Suppose that two elements from a population whose fraction defective is $P$
make up a parallel system, that is, either element must succeed for success of
the system. Assuming statistical independence, the failure probability of the
system is $P^2$. It can be shown directly from the definition of confidence intervals
that the numbers $[c_n(k; \alpha)]^2$ furnish a system of upper confidence limits for
estimation of $P^2$ with the same confidence coefficient $\alpha$. Thus Table 1 may also
be used for such a parallel system; and one may treat more than two elements
similarly.

If two parallel elements are taken instead from two distinct populations,
then the problem is more difficult. Let us suppose that the two populations
have been sampled and that $k_1$ failures are observed in a sample of $n_1$ from the
first population and that $k_2$ failures are observed in a sample of $n_2$ from the
second. One would like to find a set of numbers $c_{n_1n_2}(k_1, k_2; \alpha)$ having the
property that

$$\operatorname{Prob}\{0 \le P_1P_2 \le c_{n_1n_2}(k_1, k_2; \alpha)\} \ge \alpha \quad \text{for all } 0 \le P_1, P_2 \le 1. \tag{2}$$

A particular solution is discussed in Section 4 which is the basis for Table 3.
Like the values in Table 1, these numbers are based on a Poisson approximation.
The accuracy is believed to be adequate for practical applications whenever
both sample sizes exceed 40, and the sign of the error is the same as indicated
in Table 2. To obtain upper confidence limits from the tabulated values
it is necessary to divide by the product $n_1n_2$ of the sample sizes.

TABLE 1
VALUES OF $nc_n(k; \alpha)$ OBTAINED FROM THE POISSON APPROXIMATION

TABLE 2
ACCURACY OF THE POISSON APPROXIMATION: $c_{100}(k; 0.9)$

    k      True value       Approximate value       Error
           from (3)         from (6) and (7)
    0      0.02276          0.02303                 0.00027
    2      0.0524           0.0532                  0.0008
    5      0.0908           0.0928                  0.0020
    10     0.150            0.154                   0.004

Before giving examples of the use of the tables a few facts concerning point
estimators will be noted since it is interesting to compare the point and interval
estimates. Of course, point estimators, like confidence intervals, are not unique.
Two that are commonly used are the maximum likelihood estimator and the
minimum variance unbiased estimator. For the parameters in question these
are:

    Parameter     Maximum likelihood     Minimum variance
                  estimator              unbiased estimator
    $P$           $k/n$                  $k/n$
    $P^2$         $(k/n)^2$              $k(k-1)/n(n-1)$
    $P_1P_2$      $k_1k_2/n_1n_2$        $k_1k_2/n_1n_2$

The second column can be derived by elementary methods and the last column
can be established by the methods of Bhattacharyya [1].

To illustrate the use of the tables let us suppose that lifeboats are equipped
with pistols for shooting off flares, and that two types, designated by A and B,
are available. Under tests simulating actual conditions of use, one pistol of
type A in a sample of 100 has failed, while four of type B in a sample of 100
have failed ($k_1 = 1$, $k_2 = 4$, $n_1 = n_2 = 100$). If one pistol of each type is placed in a
lifeboat, what is the probability $P_1P_2$ that neither will operate properly? Both
point estimators give the estimate $k_1k_2/n_1n_2 = 1 \times 4/100^2 = 0.0004$. The 90%
upper confidence limit from Table 3 is $18.8/100^2 = 0.00188$. If two pistols, both
of type A, are placed in the boat, the failure probability is $P_1^2$. The point
estimate is either 0.0001 or zero, depending on which estimator is used. The 90%
upper confidence limit from Table 1 is $[c_{100}(1; 0.9)]^2 = (3.89/100)^2 = 0.00151$.

Suppose instead that (with the same sample sizes) two of each type have
failed. Then the point estimate is $k_1k_2/n_1n_2 = 0.0004$ (as before), and the 90%
upper confidence limit for $P_1P_2$ is $16.8/100^2 = 0.00168$. Finally, suppose that
samples of 150 of each type are taken and that three of type A and three of
type B are found to be defective. The point estimate again is 0.0004, but the
upper confidence limit is smaller than before, $28.9/150^2 = 0.00128$, as is
reasonable.
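The arithmetic of the lifeboat example can be retraced in a few lines. This is only a sketch: the tabulated values 18.8, 16.8, 28.9 and 3.89 are the ones quoted in the text above, and the function names are hypothetical.

```python
def point_estimate_product(k1, k2, n1, n2):
    """Point estimate k1*k2/(n1*n2), common to both estimators of P1*P2."""
    return (k1 * k2) / (n1 * n2)


def ml_estimate_square(k, n):
    """Maximum likelihood estimate (k/n)^2 of P^2."""
    return (k / n) ** 2


def unbiased_estimate_square(k, n):
    """Minimum variance unbiased estimate k(k-1)/(n(n-1)) of P^2."""
    return k * (k - 1) / (n * (n - 1))


def limit_for_product(tabulated, n1, n2):
    """Upper limit for P1*P2: a Table 3 value divided by n1*n2."""
    return tabulated / (n1 * n2)


def limit_for_square(tabulated, n):
    """Upper limit for P^2: the square of a Table 1 value divided by n."""
    return (tabulated / n) ** 2


# One failure of type A and four of type B, samples of 100 each.
print(point_estimate_product(1, 4, 100, 100), limit_for_product(18.8, 100, 100))
# Two pistols of type A in the same boat.
print(ml_estimate_square(1, 100), unbiased_estimate_square(1, 100),
      limit_for_square(3.89, 100))
# Two failures of each type, then three of each in samples of 150.
print(limit_for_product(16.8, 100, 100), limit_for_product(28.9, 150, 150))
```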

Confidence interval theory does not ensure that reasonable results will be
obtained when individual intervals from different systems are compared as
has been done above. However, such comparisons might be used as criteria for
selecting a reasonable system of intervals to be used consistently. It is to be
emphasized that the intervals given by Table 3 are not unique. But it is claimed
that if they are used consistently, the inequality $P_1P_2 \le c_{n_1n_2}(k_1, k_2; \alpha)$ will be
satisfied in at least $100\alpha$ per cent of cases in the long run; and it is claimed
that the solution is about as reasonable as any that could be given.


3. ESTIMATION OF A SINGLE BINOMIAL PARAMETER $P$


The conventional solution for estimating $P$ is obtained by taking $c_n(k; \alpha)$ as
the value of $c$ for which

$$\sum_{i=0}^{k} \binom{n}{i} c^{i}(1-c)^{n-i} = 1 - \alpha, \qquad k = 0, 1, \dots, n-1, \tag{3}$$

and $c_n(n; \alpha) = 1$. This result is given by Mood [5], for example. It is evident,
however, that (3) is not implied by (1); for (1) could also be satisfied simply by
taking $c_n(k; \alpha) = 1$ (all $k$). There are many reasonable criteria which might be
imposed in addition to the definition (1) which would lead to (3). We choose
to do this by the conditions of “regularity” and “shortness” defined as follows:

Regularity: If $k_1 < k_2$ then $c_n(k_1; \alpha) \le c_n(k_2; \alpha)$ (4)
Shortness: $c_n(k; \alpha)$ should be as small as possible (5)


It can be shown that (1), (4), and (5) lead to the conventional solution (3).
It is readily seen that (3) does not follow from (1) and (4) alone. It is also
true that (1) and (5) alone are not sufficient since the “regular” solution defined
using (4) does not generally give a uniformly shortest system of intervals; thus
(5) has no meaning without an added assumption such as (4). Systems of
confidence intervals (including the system given by (3)) are most commonly
derived by considering the distribution of a point estimator, such as $k/n$. Notice
that conditions (4) and (5) make reference to a point estimator unnecessary.
Although we obtain the conventional solution here, in more general problems
one might wish to eliminate the point estimator in order to obtain a larger class
of possible solutions.

TABLE 3
UPPER CONFIDENCE LIMITS FOR $P_1P_2$

The tabulated values are Poisson approximations to $n_1n_2c_{n_1n_2}(k_1, k_2; \alpha)$ where
$c_{n_1n_2}(k_1, k_2; \alpha)$ are upper confidence limits for estimation of $P_1P_2$. The solution is
symmetric in $k_1$ and $k_2$ so that when $k_2 < k_1$ the result may be obtained by interchanging the $k$'s.

Charts from which numerical values of $c_n(k; \alpha)$ can be obtained have been
prepared by Simon [7], and charts of the corresponding two-sided intervals
were first given by Clopper and Pearson [2].

One can obtain a useful approximation to $c_n(k; \alpha)$ by supposing that $n$ is
large and $c$ is small and by substituting Poisson terms for the binomial terms
in (3). This leads to the following result, which we may call the Poisson
approximation to $c_n(k; \alpha)$: If $\lambda(k; \alpha)$ is the value of $\lambda$ for which

$$e^{-\lambda} \sum_{i=0}^{k} \frac{\lambda^{i}}{i!} = 1 - \alpha, \qquad k = 0, 1, 2, \dots \tag{6}$$

then the solution of (3) is given approximately by

$$c_n(k; \alpha) = \lambda(k; \alpha)/n. \tag{7}$$

Thus in the Poisson approximation, the upper confidence limits are inversely
proportional to the sample size. The $\lambda$'s are upper confidence limits for
estimation of a Poisson parameter and are limiting values of $nc_n(k; \alpha)$ as $n$ tends to
infinity. Table 1 was calculated from (6). The accuracy of the Poisson
approximation is best when $n$ is large and $k$ is small (for then $c$ is small, as assumed
above).

TABLE 3 (continued)
Confidence coefficient $\alpha = 0.95$

Before proceeding to the estimation of the product $P_1P_2$ let us notice why
the conventional interval estimate for $P$ can be used to estimate $P^2$. On making
the trivial observation that $P^2 \le c^2$ if and only if $P \le c$, one has

$$\operatorname{Prob}\{0 \le P^2 \le [c_n(k; \alpha)]^2\} = \operatorname{Prob}\{0 \le P \le c_n(k; \alpha)\} \ge \alpha. \tag{8}$$

Thus $[c_n(k; \alpha)]^2$ is a system of upper confidence limits for estimating $P^2$. More
generally, if $f(P)$ is an increasing function of $P$, then $f(c)$ is a system of upper
confidence limits for estimation of $f(P)$.


4. ESTIMATION OF A PRODUCT $P_1P_2$


Consider now the problem of finding numbers $c_{n_1n_2}(k_1, k_2; \alpha)$ satisfying (2).
As in the estimation of $P^2$, it is possible to use the conventional estimate of $P$ to
derive an answer. For brevity let $c_1 = c_{n_1}(k_1; \alpha_1)$ and $c_2 = c_{n_2}(k_2; \alpha_2)$. Since $P_1 \le c_1$
and $P_2 \le c_2$ imply $P_1P_2 \le c_1c_2$ (but not conversely) one has

$$\operatorname{Prob}\{0 \le P_1P_2 \le c_1c_2\} \ge \operatorname{Prob}\{0 \le P_1 \le c_1\}\,\operatorname{Prob}\{0 \le P_2 \le c_2\} \ge \alpha_1\alpha_2. \tag{9}$$

Thus for any fixed $\alpha_1$ and $\alpha_2$, a system of upper confidence limits with confidence
coefficient $\alpha = \alpha_1\alpha_2$ is given by

$$c_{n_1n_2}(k_1, k_2; \alpha) = c_{n_1}(k_1; \alpha_1)\,c_{n_2}(k_2; \alpha_2). \tag{10}$$

A few values obtained in this way are given in Table 4. The solution obtained
in this way is not a particularly good one, however, since all of the intervals
can be made considerably shorter.
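The product rule (10) is easy to evaluate in the Poisson approximation, where $n_1n_2c_{n_1n_2}(k_1, k_2; \alpha)$ becomes $\lambda(k_1; \alpha_1)\lambda(k_2; \alpha_2)$. The sketch below assumes $\alpha_1 = \alpha_2 = \sqrt{\alpha}$, as in Table 4, and solves (6) by bisection; the function names are hypothetical. For $k_1 = 1$, $k_2 = 4$ the result can be set beside the Table 3 value 18.8 quoted in Section 2.

```python
import math


def poisson_cdf(k, lam):
    """Probability of k or fewer events for a Poisson mean lam."""
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return total


def poisson_limit(k, alpha):
    """lambda(k; alpha) of equation (6), found by bisection."""
    lo, hi = 0.0, 1.0
    while poisson_cdf(k, hi) > 1 - alpha:
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if poisson_cdf(k, mid) > 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def product_rule_value(k1, k2, alpha):
    """Poisson form of (10) with alpha_1 = alpha_2 = sqrt(alpha)."""
    root = math.sqrt(alpha)
    return poisson_limit(k1, root) * poisson_limit(k2, root)


value = product_rule_value(1, 4, 0.90)
print(f"product rule: {value:.1f}   Table 3 value quoted in the text: 18.8")
```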

To see what is involved in making the intervals as short as possible, let us
consider any system of $N = (n_1+1)(n_2+1)$ numbers $c_{n_1n_2}(k_1, k_2; \alpha)$. These can
be ordered in a nondecreasing sequence

$$c_{n_1n_2}(k_1^{(1)}, k_2^{(1)}; \alpha) \le c_{n_1n_2}(k_1^{(2)}, k_2^{(2)}; \alpha) \le \cdots \le c_{n_1n_2}(k_1^{(N)}, k_2^{(N)}; \alpha) \tag{11}$$

corresponding to one of the $N!$ permutations of the number pairs $(k_1, k_2)$. This
can always be done in at least one way, and in more than one way if some
of the $c$'s happen to be equal.

Suppose the ordering is fixed so that an index $i$ $(i = 1, 2, \dots, N)$ identifies
each sample point. Of all the sets of $N$ numbers satisfying the inequalities (11)
and the inequality (2) there is a unique set whose members are uniformly the
smallest. This set is given by

$$c_{n_1n_2}(k_1^{(i)}, k_2^{(i)}; \alpha) = \sup\Big\{P_1P_2 \;\Big|\; \sum_{j \le i} B_j \ge 1 - \alpha\Big\} \tag{12}$$

where $B_j$ is the product of binomial terms defined by

$$B_j = B_{n_1n_2}(k_1^{(j)}, k_2^{(j)}; P_1, P_2), \tag{13}$$

$$B_{n_1n_2}(k_1, k_2; P_1, P_2) = \binom{n_1}{k_1} P_1^{k_1}(1-P_1)^{n_1-k_1} \binom{n_2}{k_2} P_2^{k_2}(1-P_2)^{n_2-k_2}. \tag{14}$$

(That is, the summation $\sum_{j \le i} B_j$ in (12) includes the probabilities of all sample
points whose index does not exceed $i$.) The proof of this assertion follows
directly from the definition of the supremum. The generalization of (12) to an
arbitrary discrete distribution is indicated in Section 7.

TABLE 4
POISSON APPROXIMATION TO LARGER-THAN-NECESSARY VALUES OF $n_1n_2c_{n_1n_2}(k_1, k_2; \alpha)$ DEFINED BY (10)
($\alpha_1 = \alpha_2 = 0.9487$; $\alpha = \alpha_1\alpha_2 = 0.90$)

Thus if one accepts shortness of the intervals as a basic requirement, then
a unique system of intervals is given by (12) as soon as any particular ordering
of the intervals (11) is assumed. In the estimation of a single parameter $P$ a
unique solution (the conventional one) is established by the regularity
assumption (4). In the estimation of $P_1P_2$, however, many solutions will be regular in
the sense that $c_{n_1n_2}(k_1, k_2; \alpha)$ is a nondecreasing function of $k_1$ (or of $k_2$) for fixed
$k_2$ (or for fixed $k_1$).

How then can a unique ordering of the intervals be selected? There seem
to be many more-or-less reasonable ways to do this, each of which leads to a
different system of confidence intervals. After considering many criteria, the
following was selected on the basis of reasonableness and ease of application:

$$c_{n_1}(k_1; \sqrt{\alpha})\,c_{n_2}(k_2; \sqrt{\alpha}) \le c_{n_1}(k_1'; \sqrt{\alpha})\,c_{n_2}(k_2'; \sqrt{\alpha}) \quad \text{implies} \quad c_{n_1n_2}(k_1, k_2; \alpha) \le c_{n_1n_2}(k_1', k_2'; \alpha). \tag{15}$$

That is, we choose to order the intervals in the same way as those given by (10)
with $\alpha_1 = \alpha_2 = \sqrt{\alpha}$.

The criterion (15) is admittedly arbitrary, and it is not claimed that the 
solution is “best” in any special sense. It would be possible to find other systems 
of intervals which would be about equally reasonable; on the other hand, it 
would be difficult to establish any decisively superior solution. Other criteria, 
too numerous to mention, were also considered; they were found to have little if 
any theoretical advantage and often considerable disadvantage in the calcula- 
tional labor required. Some reasonable properties of solutions obtained from 
(15) will be indicated presently. 

When the sample sizes are moderate or large and the values of $k_1$ and $k_2$ are
small, then to a good approximation, $B_j$ in (12) can be replaced by a Poisson
term. This is accomplished by replacing (14) by

$$B_{n_1n_2}(k_1, k_2; P_1, P_2) = (e^{-\lambda_1}\lambda_1^{k_1}/k_1!)(e^{-\lambda_2}\lambda_2^{k_2}/k_2!) \tag{16}$$

in which

$$\lambda_1 = n_1P_1, \qquad \lambda_2 = n_2P_2. \tag{17}$$

The main advantage of the Poisson approximation is that it makes possible
the tabulation of a quantity which is independent of $n_1$ and $n_2$ and is valid for
all large values of $n_1$ and $n_2$.

Table 3 gives upper confidence limits calculated from (12) by using the
approximation (16). The ordering was determined from the condition (15), using
the Poisson approximation (7) for $c_n(k; \alpha)$. The calculational procedures are
described in Section 6.

A few reasonable properties of the solutions given by Table 3 are:

(a) $c_{n_1n_2}(k_1, k_2; \alpha)$ is an increasing function of $k_1$ and $k_2$ for fixed $n_1$, $n_2$, and $\alpha$.

(b) The intervals distinguish between the two samples $(k_1, k_2) = (k_1', 0)$ and
$(k_1, k_2) = (k_1'', 0)$ (where $k_1' < k_1''$); the point estimator $k_1k_2/n_1n_2$ fails to make
this distinction.

(c) Each upper confidence limit exceeds the corresponding point estimate
$k_1k_2/n_1n_2$.

(d) When two sets of data give the same nonzero point estimate and the
second set has larger sample sizes than the first, then one would expect the second
set to give the smaller upper confidence limit (i.e., a value nearer the point
estimate). The tabulated solutions have this property except in a few isolated
instances where the deviations are limited to about five per cent of the interval
length.

In the Poisson approximation the problem becomes symmetric in $k_1$ and $k_2$
(otherwise this symmetry holds only when $n_1 = n_2$). Table 3 gives symmetric
solutions: $c_{n_1n_2}(k_1, k_2; \alpha) = c_{n_1n_2}(k_2, k_1; \alpha)$. If this symmetry is rejected, one can
arbitrarily favor intervals for which $k_1 < k_2$, making them slightly shorter. To
illustrate, a few values have been calculated for this unsymmetrical solution;
the results are given in Table 5. Table 5 favors shortness of intervals as
compared with Table 3 which favors symmetry. In using Table 5 the arbitrary
labelling must be done either in advance of the experiment or else on a random
basis; there seems to be no reason for basing the choice of labels on the ratio
$n_1/n_2$.


5. USE OF AN AUXILIARY RANDOM ELEMENT 


The gain in shortness of intervals achieved by allowing an unsymmetrical
solution can be developed further. The answer read from Table 5 is random in
the sense that (if $k_1 \ne k_2$) it depends on how one happens to assign the labels 1
and 2 to the $k$'s. To carry this further one can introduce an auxiliary random
variable $t$ (to be taken from a table of random numbers, for example) and define
a system of intervals which depends not only on the $k$'s but on $t$ also. To do
this is simply to allow a “mixed strategy” in the game-theory sense. Solutions
of this type have been given by Stevens [8] and by Eudey [3] for estimating a
single parameter $P$. Their solutions are such that inequality can be replaced by
equality in the definition (1):

$$\operatorname{Prob}\{0 \le P \le c_n(k, t; \alpha)\} = \alpha \quad \text{for all } 0 < P < 1. \tag{18}$$








In addition, their intervals are never longer than those given by the
conventional solution:

$$c_n(k, t; \alpha) \le c_n(k; \alpha) \quad \text{for all } k, t, n, \alpha. \tag{19}$$

In the estimation of $P_1P_2$ one could achieve an analogue of either (18) or (19)
separately, but not both simultaneously.

TABLE 5
UNSYMMETRICAL VALUES OF $n_1n_2c_{n_1n_2}(k_1, k_2; 0.9)$


6. CALCULATION OF UPPER CONFIDENCE LIMITS 


In this section we describe the way in which numerical values of upper
confidence limits may be obtained from the defining equation (12). From the
assumed ordering of the intervals (given, for example, by (15)) one determines,
for given values of $k_1$ and $k_2$, which points of the sample space correspond to
intervals whose lengths are less than or equal to the length of the desired
interval. The binomial terms (14) which correspond to such sample points are
then summed. The expression $\sum B_j$, which represents this sum, always takes
values in the range $0 \le \sum B_j \le 1$ when $P_1$ and $P_2$ are in the unit square $0 \le P_1 \le 1$,
$0 \le P_2 \le 1$. The expression $\sum B_j = 1 - \alpha$ defines a curve in the unit square; on
one side of the curve one has $\sum B_j < 1 - \alpha$ and on the other side, $\sum B_j > 1 - \alpha$.
The upper confidence limit is the supremum value of $P_1P_2$ for the region defined
by $\sum B_j \ge 1 - \alpha$. For the assumed ordering this supremum will be equal to the
maximum value of $P_1P_2$ on the curve $\sum B_j = 1 - \alpha$. To obtain a numerical value
for the upper confidence limit in general it is necessary first to fix the location
of the curve $\sum B_j = 1 - \alpha$ numerically and then to find the maximum value of
$P_1P_2$ on the curve numerically. In finding the Poisson approximation the
procedure is similar; one works in the infinite region $0 \le \lambda_1$, $0 \le \lambda_2$ instead of in the
unit square, and one searches for the maximum value of $\lambda_1\lambda_2$ on a curve in the
infinite region. For the values in Table 3, all of the maxima lie on the line
$\lambda_1 = \lambda_2$, but for the unsymmetrical values in Table 5 this is not the case. The
tables of Molina [4] are useful for these calculations.
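The procedure just described can be sketched numerically for the Poisson approximation. This is an illustrative reading of (12), (15) and (16), not the calculation actually used for Table 3: sample points are ranked by the product $\lambda(k_1; \sqrt{\alpha})\lambda(k_2; \sqrt{\alpha})$, ties are kept together, and the maximum of $\lambda_1\lambda_2$ over the region $\sum B_j \ge 1 - \alpha$ is found by scanning $\lambda_1$ on a grid and bisecting for the boundary value of $\lambda_2$. All function names, the grid size and the tie tolerance are assumptions.

```python
import math


def poisson_pmf(j, lam):
    return math.exp(-lam) * lam**j / math.factorial(j)


def poisson_cdf(k, lam):
    return sum(poisson_pmf(j, lam) for j in range(k + 1))


def largest_root(f, level):
    """Largest x >= 0 with f(x) >= level, for f decreasing and f(0) >= level."""
    lo, hi = 0.0, 1.0
    while f(hi) >= level:
        hi *= 2
    for _ in range(80):
        mid = (lo + hi) / 2
        if f(mid) >= level:
            lo = mid
        else:
            hi = mid
    return lo


def single_limit(k, alpha):
    """lambda(k; alpha) of equation (6)."""
    return largest_root(lambda lam: poisson_cdf(k, lam), 1 - alpha)


def ordered_points(k1, k2, alpha):
    """Sample points ranked at or below (k1, k2) by criterion (15)."""
    root = math.sqrt(alpha)
    target = single_limit(k1, root) * single_limit(k2, root) * (1 + 1e-12)
    limits = []
    while single_limit(len(limits), root) * single_limit(0, root) <= target:
        limits.append(single_limit(len(limits), root))
    return [(i, j) for i, a in enumerate(limits)
            for j, b in enumerate(limits) if a * b <= target]


def table3_entry(k1, k2, alpha, steps=400):
    """Maximum of lam1*lam2 over the region where the summed terms (16) are >= 1 - alpha."""
    points = ordered_points(k1, k2, alpha)

    def coverage(lam1, lam2):
        return sum(poisson_pmf(i, lam1) * poisson_pmf(j, lam2) for i, j in points)

    level = 1 - alpha
    lam1_max = largest_root(lambda x: coverage(x, 0.0), level)
    best = 0.0
    for s in range(1, steps):
        lam1 = lam1_max * s / steps
        lam2 = largest_root(lambda y: coverage(lam1, y), level)
        best = max(best, lam1 * lam2)
    return best


print(table3_entry(0, 0, 0.90))
```

The bisection relies on the summed probability decreasing in each of $\lambda_1$ and $\lambda_2$, which holds here because the ranked set contains, with each point, every point having smaller $k_1$ or smaller $k_2$. For $k_1 = k_2 = 0$ the result can be checked against the closed form (25) multiplied by $n^2$.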

A very simple case can be given as an illustration. The values $k_1 = k_2 = 0$ give
the smallest upper confidence limit so that the sum contains only one term:

$$\sum B_j = (1 - P_1)^{n_1}(1 - P_2)^{n_2} = 1 - \alpha. \tag{20}$$

If we simplify further by putting $n_1 = n_2 = n$, then the maximum value of $P_1P_2$
occurs at the point where $P_1 = P_2$. Solving (20) gives

$$P_1 = P_2 = 1 - (1 - \alpha)^{1/2n} \tag{21}$$

and the upper confidence limit is

$$c_{nn}(0, 0; \alpha) = P_1P_2 = [1 - (1 - \alpha)^{1/2n}]^2. \tag{22}$$

Using the Poisson approximation, (20) becomes

$$e^{-\lambda_1}e^{-\lambda_2} = 1 - \alpha. \tag{23}$$

This represents a straight line in the $\lambda_1\lambda_2$ plane having slope $-1$, both
intercepts being equal to $-\ln(1 - \alpha)$. The maximum value of $\lambda_1\lambda_2$ on this line occurs
at

$$\lambda_1 = \lambda_2 = -\tfrac{1}{2}\ln(1 - \alpha) \tag{24}$$

which gives the approximate upper confidence limit

$$c_{nn}(0, 0; \alpha) = P_1P_2 = \frac{\lambda_1\lambda_2}{n_1n_2} = \frac{1}{4n^2}[\ln(1 - \alpha)]^2. \tag{25}$$

The last expression is equal to the first term of (22) expanded in reciprocal
powers of $n$.
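A short numerical comparison of the exact value (22) with the Poisson value (25) shows how quickly the two agree as $n$ grows. This is a sketch with hypothetical function names; the sample sizes are arbitrary.

```python
import math


def exact_limit_00(n, alpha):
    """Equation (22): exact upper limit for P1*P2 when k1 = k2 = 0, n1 = n2 = n."""
    p = 1 - (1 - alpha) ** (1 / (2 * n))
    return p * p


def poisson_limit_00(n, alpha):
    """Equation (25): Poisson approximation to the same limit."""
    return math.log(1 - alpha) ** 2 / (4 * n * n)


for n in (40, 100, 1000):
    exact = exact_limit_00(n, 0.90)
    approx = poisson_limit_00(n, 0.90)
    print(f"n={n}: exact={exact:.3e}  Poisson={approx:.3e}  ratio={exact / approx:.4f}")
```

The Poisson value is the larger of the two, in line with the remark in Section 2 that the approximation errs on the safe side.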


7. GENERALIZATION TO AN ARBITRARY DISCRETE DISTRIBUTION 


Equation (12), which defines a shortest system of one-sided intervals, subject
to the inequalities (11), has a rather obvious generalization to an arbitrary
discrete distribution. Suppose that a (perhaps multidimensional) random variable
$X$ takes discrete values $x_1, x_2, \dots$, and suppose that
$\operatorname{Prob}\{X = x_i\} = p(x_i; \theta_1, \theta_2, \dots, \theta_m)$ where the $\theta$'s are parameters of the distribution. If
$\phi(\theta_1, \theta_2, \dots, \theta_m)$ is any function of the parameters, then a function $c(X; \alpha)$
represents a system of upper confidence limits for the estimation of $\phi$ provided

$$\operatorname{Prob}\{\phi(\theta_1, \theta_2, \dots, \theta_m) \le c(X; \alpha)\} \ge \alpha \quad \text{for all } \theta_1, \theta_2, \dots, \theta_m. \tag{26}$$

Suppose that any system of upper confidence limits $c(x_i; \alpha)$ is given. Then the
quantities $c'(x_i; \alpha)$ defined by

$$c'(x_i; \alpha) = \sup\Big\{\phi(\theta_1, \theta_2, \dots, \theta_m) \;\Big|\; \sum_j p(x_j; \theta_1, \theta_2, \dots, \theta_m) \ge 1 - \alpha\Big\} \tag{27}$$

(where the summation is taken over all values of $j$ for which $c(x_j; \alpha) \le c(x_i; \alpha)$)
are also a system of upper confidence limits and are uniformly smaller than the
given ones, i.e.,

$$c'(x_i; \alpha) \le c(x_i; \alpha) \quad \text{for all } i. \tag{28}$$

OPTIMUM SAMPLING IN BINOMIAL POPULATIONS¹


PAUL N. SOMERVILLE
General Analysis Corporation


1. INTRODUCTION


In choosing the sample size for determining which of two populations has the
larger mean, one ordinarily states a difference which is considered worth
detecting, and a confidence with which one wishes to detect a difference of this
size. Given this information, it is easy to calculate the minimum required
sample size. The experimenter, however, must still face the problem of what
difference and what confidence coefficient to choose. Actually, these may be
considered as functions of the cost of sampling, and the amount and kind of
use to be made of the result, i.e., the cost of making an erroneous decision.
Assuming we are given two binomial populations, this paper will present a
method of giving the required sample size as a simple function of the cost of
sampling, the amount of use to be made of the result, and the loss per unit in
choosing the wrong population. This sample size minimizes the maximum
expected loss over possible values of the population parameters.

A discussion of statistical decision theory and in particular of the minimax
principle is given in [5]. Some applications are given in [1], [2], [4] and [6].


2. EXAMPLE 


An experiment station worker wishes to know which of two hog diets he
should recommend for the following year. To enable him to decide, he plans to
feed each diet to the same number of hogs (say $n$). Since the diets are equally
expensive, he will recommend the diet which produces the greater number of
Grade A hogs. How large should he make $n$, the number of hogs which will be
fed a given diet?

Let the diets have probabilities $\theta_0$ and $\theta_1$ of producing a Grade A hog. Suppose
that the experimenter is willing to specify two constants $d'$ and $\alpha$ such that if
$|\theta_0 - \theta_1| > d'$ then the probability is at least $1 - \alpha$ that he will choose the better
diet as a result of his “preliminary experiment.” It is not difficult to show that
a good approximation to the required $n$ is then given by the formula
$n = z_{1-\alpha}^2/2d'^2$, where $z_P$ is the $P$th percentile of the standardized normal
distribution. It is clear that the required sample size is a function of $\alpha$ and $d'$,
although it is not at all obvious how the experimenter should choose $\alpha$ or $d'$.

Let $C(n) = c_0 + c_1n$ be the cost of the preliminary experiment (i.e., the cost
is a linear function of $n$). Let the loss involved in choosing the poorer diet,
$W(\theta_1, \theta_0)$, be proportional to $|\theta_0 - \theta_1|$, i.e., $W(\theta_1, \theta_0) = K|\theta_0 - \theta_1|$. Then in the
following sections we will show that $n = 0.1534(K/c_1)^{2/3}$ is the sample size which
minimizes the maximum expected loss.





¹ Prepared in part while the author was at the Virginia Polytechnic Institute, under an Agricultural Marketing
Act Contract, Project RM: c-629-1, with the U.S. Department of Agriculture. The contract was administered by the
former Bureau of Agricultural Economics.

In our example, suppose that a Grade A hog brings a return of $g_A$ dollars,
while a grade other than A brings a return of $g_B$ dollars ($g_A > g_B$). Then the
expected loss per hog from choosing the poorer diet is

$$\big|[g_A\theta_0 + g_B(1 - \theta_0)] - [g_A\theta_1 + g_B(1 - \theta_1)]\big| = (g_A - g_B)\,|\theta_0 - \theta_1|.$$

Suppose $N$ is the number of hogs which will be fed a given diet as a result of our
recommendations. Then, in our formula for $n$, we may put $K = N(g_A - g_B)$, and
our recommended value for $n$ is $n = 0.1534N^{2/3}(g_A - g_B)^{2/3}/c_1^{2/3}$. This “optimum”
sample size is a function of the costs of sampling, the amount of use to be
made of the result, and the loss involved in choosing the wrong population.
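As a worked illustration of the formula, the sketch below evaluates the recommended $n$ for one set of inputs. The figures used (10,000 hogs to be fed on the recommendation, a return difference of 2 dollars per hog, a sampling cost of 5 dollars per hog) are hypothetical and are not taken from the paper; the function name is likewise an assumption.

```python
def optimum_sample_size(N, g_a, g_b, c1):
    """n = 0.1534 * (K / c1)^(2/3) with K = N * (g_a - g_b)."""
    K = N * (g_a - g_b)
    return 0.1534 * (K / c1) ** (2 / 3)


# Hypothetical inputs: N = 10000 hogs, g_A - g_B = 2 dollars, c1 = 5 dollars per hog.
n = optimum_sample_size(N=10_000, g_a=12.0, g_b=10.0, c1=5.0)
print(f"recommended hogs per diet: about {n:.1f}")
```

Because $n$ grows only as the two-thirds power of $K/c_1$, doubling the number of hogs affected by the decision raises the recommended sample size by a factor of about 1.59.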


3. MODEL 


Assume we have two binomial populations, with parameters $\theta_0$, $\theta_1$, where
$\theta_0 > \theta_1$. A sample of size $n$ is taken from each population, and the one with the
larger number of successes is selected (in case of a tie, the populations are each
selected with probability $\tfrac{1}{2}$). Let $W(\theta_1, \theta_0)$ be the loss incurred in selecting the
population with parameter $\theta_1$, i.e., the poorer population, and let $p_1$ be the
probability of choosing this population. Let $C(n)$ be the cost of sampling $n$
individuals from each population. Then, we may write our total expected loss
or cost as

$$L = W(\theta_1, \theta_0)\,p_1 + C(n). \tag{1}$$

In many practical problems it appears that $W(\theta_1, \theta_0)$ is well represented by
$K|\theta_0 - \theta_1|$ where $K$ does not depend on $\theta_0$, $\theta_1$ or $n$. Thus we shall put
$W = K|\theta_0 - \theta_1|$. The methods which will be used for obtaining the “optimum”
value of $n$ can be used for a wide class of functions $W$, where $W = W(\theta_0 - \theta_1)$.

4. MAXIMIZATION OF $L$ OVER VALUES OF $\theta_0$, $\theta_1$ ($n$ LARGE)


In this and the following sections, we give only a heuristic development. A
rigorous development is given in the appendix.

Let $x_0$, $x_1$ be the number of successes in samples of size $n$ from the populations
having parameters $\theta_0$, $\theta_1$, respectively. Then,

$$p_1 = \operatorname{Prob}[x_0 < x_1] + \operatorname{Prob}[x_0 = x_1]/2. \tag{2}$$

Put

$$\theta_0 + \theta_1 = 1 + k, \qquad \theta_0 - \theta_1 = d. \tag{3}$$

Then, corresponding to the restriction $0 \le \theta_1 \le \theta_0 \le 1$, it can be shown that

$$0 \le d \le 1 - |k|. \tag{4}$$

We shall now show that for $n$ large, for any $k$, the maximum of $L$ over $d$
occurs where $d = 0.5316(1 - k^2)^{1/2}/n^{1/2}$ and is equal to $0.1202K(1 - k^2)^{1/2}/n^{1/2} + C(n)$
(approximately). Now $\operatorname{Prob}[x_0 < x_1] = \operatorname{Prob}[y_n < -u]$ where

$$y_n = \frac{x_0 - x_1 - n(\theta_0 - \theta_1)}{\sqrt{n\theta_0(1 - \theta_0) + n\theta_1(1 - \theta_1)}} = \frac{x_0 - x_1 - nd}{\sqrt{(1 - k^2 - d^2)\,n/2}}, \tag{5}$$

$$u = \frac{\sqrt{n}\,(\theta_0 - \theta_1)}{\sqrt{\theta_0(1 - \theta_0) + \theta_1(1 - \theta_1)}} = \frac{\sqrt{2n}\,d}{\sqrt{1 - k^2 - d^2}}. \tag{6}$$

Thus, since for large $n$, $\operatorname{Prob}[x_0 = x_1]$ is negligible,

$$p_1 = \operatorname{Prob}[y_n < -u] \quad \text{(approximately)}$$

where

$$E\,y_n = 0, \qquad E\,y_n^2 = 1.$$

Set

$$M = d \cdot p_1.$$

Then, for large $n$,

$$M = d \cdot \operatorname{Prob}[y_n < -u] \ \text{(approximately)} = M(d, n, k).$$

Thus,

$$L = KM + C(n). \tag{11}$$

Using the Central Limit Theorem for large $n$,

$$M = d \cdot F(-u) \quad \text{(approximately)} \tag{12}$$

where $F$ is the cumulative distribution function of the standard normal variate.
Solving for $d$ in (6), we obtain

$$M = \frac{u(1 - k^2)^{1/2}}{(2n + u^2)^{1/2}}\,F(-u). \tag{13}$$

Since the maximum of $u \cdot F(-u)$ is 0.1700 and occurs when $u = 0.7518$, then, for
large $n$,

$$\max_d M = 0.1202(1 - k^2)^{1/2}/n^{1/2} \tag{14}$$

and occurs when

$$d = 0.5316(1 - k^2)^{1/2}/n^{1/2}. \tag{15}$$

Expression (14) is maximized over $k$ when $k = 0$, and thus

$$\max_{\theta_0, \theta_1} L = 0.1202K/n^{1/2} + C(n) \quad \text{(approximately)}. \tag{16}$$

If we put $C(n) = c_0 + c_1n$, (16) is minimized with respect to $n$ when

$$n = 0.1534K^{2/3}/c_1^{2/3} \quad \text{(approximately)}. \tag{17}$$

This is the sample size which minimizes the maximum expected loss $L$ over all
possible values of $\theta_0$, $\theta_1$ (“large” $n$). For this value of $n$,

$$L = 0.4603K^{2/3}c_1^{1/3} + c_0. \tag{18}$$
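The numerical constants in (14) to (18) can be retraced from the single maximization of $u \cdot F(-u)$. The sketch below is a check under the large-$n$ simplification in which $u^2$ is neglected beside $2n$ in (13); it uses a ternary search, which is adequate because $u \cdot F(-u)$ rises to one peak and then falls. The variable names are hypothetical.

```python
import math


def normal_cdf(x):
    """Standard normal cumulative distribution function F(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def maximize_unimodal(f, lo, hi, iterations=200):
    """Ternary search for the maximizer of a single-peaked function on [lo, hi]."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if f(a) < f(b):
            lo = a
        else:
            hi = b
    return (lo + hi) / 2


u_star = maximize_unimodal(lambda u: u * normal_cdf(-u), 0.0, 3.0)
peak = u_star * normal_cdf(-u_star)           # maximum of u * F(-u)
m_const = peak / math.sqrt(2)                 # coefficient in (14)
d_const = u_star / math.sqrt(2)               # coefficient in (15)
n_const = (m_const / 2) ** (2 / 3)            # coefficient in (17), from d/dn of (16)
loss_const = m_const / math.sqrt(n_const) + n_const   # coefficient in (18)

print(f"u = {u_star:.4f}, max u*F(-u) = {peak:.4f}")
print(f"(14): {m_const:.4f}  (15): {d_const:.4f}  (17): {n_const:.4f}  (18): {loss_const:.4f}")
```

The coefficient in (17) follows from setting the derivative of $0.1202K/n^{1/2} + c_1n$ to zero, which gives $n^{3/2} = 0.0601K/c_1$.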


5. MAXIMIZATION OVER VALUES OF $\theta_0$, $\theta_1$ ($n$ SMALL)


In the previous section, we have shown that if $n$ is large, $L$ is maximized
over values of $\theta_0$, $\theta_1$ when $d = 0.5316/n^{1/2}$ and $k = 0$, i.e., $\theta_0 = 0.5 + 0.2658/n^{1/2}$
and $\theta_1 = 0.5 - 0.2658/n^{1/2}$. If $n$ is small, then $\operatorname{Prob}[x_0 = x_1]$ may not be negligible.
Table I shows the values of $\max_{\theta_0, \theta_1} M$ for small values of $n$, as well as
$0.12019/n^{1/2}$, the corresponding asymptotic value. It is remarkable that even
for $n = 1$, the exact and the asymptotic values differ little.

Table II gives the values of $d$ for which $M$ attains its maximum, as well as
the corresponding values of $0.5316/n^{1/2}$.
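The exact small-$n$ values can be approached by direct enumeration of the two binomial distributions. The sketch below is an illustration, not the method used for Tables I and II: it evaluates $p_1$ from (2) exactly, fixes $k$, and scans $d$ on a grid. The function names and the grid size are assumptions.

```python
import math


def binom_pmf(x, n, p):
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def prob_choose_poorer(n, theta0, theta1):
    """p_1 of equation (2): Prob[x0 < x1] + Prob[x0 = x1] / 2."""
    p0 = [binom_pmf(x, n, theta0) for x in range(n + 1)]
    p1 = [binom_pmf(x, n, theta1) for x in range(n + 1)]
    less = sum(p0[a] * p1[b] for a in range(n + 1) for b in range(a + 1, n + 1))
    tie = sum(p0[a] * p1[a] for a in range(n + 1))
    return less + tie / 2


def max_M(n, k=0.0, grid=2000):
    """Grid search over 0 < d <= 1 - |k| for the maximum of M = d * p_1."""
    best_M, best_d = 0.0, 0.0
    d_max = 1 - abs(k)
    for s in range(1, grid + 1):
        d = d_max * s / grid
        M = d * prob_choose_poorer(n, (1 + k + d) / 2, (1 + k - d) / 2)
        if M > best_M:
            best_M, best_d = M, d
    return best_M, best_d


for n in (1, 2, 3):
    M, d = max_M(n)
    print(f"n={n}: max M = {M:.5f} at d = {d:.4f}; asymptotic {0.12019 / math.sqrt(n):.5f}")
```

For $n = 1$ the enumeration reduces to $p_1 = (1 - d)/2$, so that $M = d(1 - d)/2$ has its maximum 0.125 at $d = 0.5$.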


6. USE OF A PRIORI KNOWLEDGE OF $\theta_0$, $\theta_1$ ($n$ LARGE)


Occasionally we may know something about the values of $\theta_0$ and $\theta_1$. For
example, we may be reasonably sure that $a < \theta_0 + \theta_1 = 1 + k < b$. If the interval
does not include 1, then we may set $\theta_0 + \theta_1$ equal to $a$ or $b$, whichever is nearer
to 1, and compute the corresponding value of $k$. For large $n$, the maximum of
$M$ over all values of $d$ is given by (14), from which the “optimum” sample size
may be computed to be

$$n = 0.1534K^{2/3}(1 - k^2)^{1/3}/c_1^{2/3}. \tag{19}$$

This maximum loss occurs for $d = 0.5316(1 - k^2)^{1/2}/n^{1/2}$. For small $n$, the
approximation for the maximum of $M$ over $d$ given by (14) is fairly close provided
that the absolute value of $k$ is not too large. Table III gives a comparison of
$\max_d M$ and the approximation given by (14) for $n = 1, 2, 3$, and $k = 0$, $\pm .5$,
$\pm .9$.


7. CASE WHEN NO SAMPLING IS BEST 


If we had not bothered to take a preliminary sample, but had decided to
flip a coin to decide from which population we should choose the $N$ individuals,
then our maximum loss would have been

$$L = K(\theta_0 - \theta_1)/2 = Kd/2. \tag{20}$$

Thus, if

$$0.4603K^{2/3}(1 - k^2)^{1/3}c_1^{1/3} + c_0 > Kd/2,$$

i.e.,

$$c_0 > Kd/2 - 0.4603K^{2/3}(1 - k^2)^{1/3}c_1^{1/3}, \tag{21}$$

it is better not to take a preliminary sample. If we are able from previous
experience to state an upper bound for $d$, then we should use that value in (21);
otherwise, set $d = 1$.

Cases where a preliminary sample would not be worth taking would be when
$c_0$, the cost of setting up the preliminary sample, was very large, or when $K$
was small. Since in practice $K$ will usually be proportional to $N$, $K$ will be
small when the amount of use to be made of the result is small.
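Condition (21) amounts to comparing two numbers, which the following sketch makes explicit. The function name and the inputs in the two calls are hypothetical; the coefficient 0.4603 and the default $d = 1$ are taken from the text.

```python
def sampling_is_worthwhile(K, c0, c1, d_bound=1.0, k=0.0):
    """Compare the minimax loss with sampling against the coin-flip loss K*d/2.

    Returns True when the loss with an optimum preliminary sample is the smaller
    of the two, i.e. when inequality (21) does not hold.
    """
    loss_with_sample = 0.4603 * K ** (2 / 3) * (1 - k * k) ** (1 / 3) * c1 ** (1 / 3) + c0
    loss_coin_flip = K * d_bound / 2
    return loss_with_sample < loss_coin_flip


# Large stake relative to the sampling costs: sampling pays.
print(sampling_is_worthwhile(K=20_000, c0=100.0, c1=5.0))
# Small stake and a large set-up cost: better to flip a coin.
print(sampling_is_worthwhile(K=1.0, c0=10.0, c1=1.0))
```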


8. ACKNOWLEDGEMENTS 


Lemma 4 of the appendix was proved jointly by the author and Lester R. 
Ford, Jr. The author is also indebted to one of the referees for a number of useful 
suggestions. 




























































































TABLE I
COMPARISON OF $\max_{\theta_0, \theta_1} M$ AND $0.12019/n^{1/2}$ FOR SMALL $n$

    n                    1       2       3       4       5       6       7       8       9       10
    Max M              .12500  .08702  .07055  .06087  .05431  .04950  .04577  .04277  .04030  .03821
    0.12019/n^{1/2}    .12019  .08499  .06939  .06009  .05375  .04907  .04543  .04249  .04006  .03801

TABLE II
COMPARISON OF VALUE OF $d$ AT WHICH $\max_{\theta_0, \theta_1} M$ OCCURS AND $d = 0.5316/n^{1/2}$ FOR SMALL $n$

    n                    1       2       3       4       5       6       7       8       9       10
    d                  0.5000  .3660   .3018   .2626   .2355   .2153   .1996   .1869   .1763   .1673
    0.5316/n^{1/2}     0.5316  .3759   .3069   .2658   .2377   .2170   .2009   .1879   .1772   .1681

TABLE III
COMPARISON OF $\max_d M$ AND $0.12019(1 - k^2)^{1/2}/n^{1/2}$ FOR SMALL $n$

                                        k = 0                  k = ±.5                k = ±.9
    n                               1      2      3        1      2      3        1      2      3
    .12019(1-k^2)^{1/2}/n^{1/2}   .1202  .0850  .0694    .1041  .0736  .0601    .0524  .0370  .0303
    Max_d M                       .1250  .0870  .0706    .1250  .0795  .0627    .1250  .0669  .0470
    Ratio                         .962   .977   .984     .833   .926   .958     .419   .554   .644

APPENDIX


In this appendix we shall give a rigorous proof that for $n$ sufficiently large the
maximum of $M = d \cdot p_1$ over all possible values of $\theta_0$, $\theta_1$ is $0.1202/n^{1/2} + O(1/n)$ and that this
occurs when $d = 0.5316/n^{1/2}$ approximately. Further, if we have a priori information
that $|k| \ge k'$, i.e. that $|\theta_0 + \theta_1 - 1| \ge k'$, then for $n$ sufficiently large, the maximum value
of $M$ over possible values of $\theta_0$, $\theta_1$ is $0.1202(1 - k'^2)^{1/2}/n^{1/2} + O(1/n)$, and this occurs when
$d = 0.5316(1 - k'^2)^{1/2}/n^{1/2}$ approximately.

Lemma 1. A necessary condition for $n^{1/2}M > 0.12$ is that $0.12 < n^{1/2}d < 4.2$.
Proof: From (2), we have

$$p_1 \le \operatorname{Prob}[x_0 \le x_1] = \operatorname{Prob}[y_n \le -u] \le 1/u^2 \ \text{(by Tchebycheff's inequality)} = (1 - k^2 - d^2)/2nd^2.$$

Thus

$$n^{1/2}M = n^{1/2}d \cdot p_1 \le (1 - k^2 - d^2)/2n^{1/2}d < 1/(2n^{1/2}d) \tag{23}$$

and for $n^{1/2}M$ to be greater than 0.12, $n^{1/2}d < 4.2$. Also, since $p_1 \le 1$,

$$n^{1/2}M \le n^{1/2}d \tag{24}$$

and for $n^{1/2}M > 0.12$, we must have $n^{1/2}d > 0.12$, and the lemma is proved.
Lemma 2. If $0.12 < n^{1/2}d < 4.2$, and $|k| \le k'' < 1$, then $|d \cdot F(-u) - d \cdot p_1| < c/n$ for $n$
sufficiently large, where $k''$ and $c$ are positive constants.
Proof: By the Berry-Esseen theorem [3],

$$|F(-u) - p_1| \le c'\rho_3/n^{1/2} \tag{25}$$

where $c'$ is some positive constant, and $\rho_i$ is the $i$th absolute moment of $y_1$.
It is not difficult to show that the fourth moment of $y_1$ is given by

$$\rho_4 = 2/(1 - k^2 - d^2) - 12k^2d^2/(1 - k^2 - d^2)^2. \tag{26}$$

Using the Schwarz inequality, we can show that

$$\rho_3 \le (\rho_2\rho_4)^{1/2} = \rho_4^{1/2} \le 2^{1/2}/(1 - k^2 - d^2)^{1/2}. \tag{27}$$

Suppose $n > 4.2^2/(1 - k''^2 - \epsilon)$ where $\epsilon$ is a small positive constant. Then $n^{1/2}d < 4.2$
implies that $d^2 < 1 - k''^2 - \epsilon$, and for $|k| \le k''$, $\rho_3 < c''$ where $c'' = (2/\epsilon)^{1/2}$.
Thus, for $n^{1/2}d < 4.2$, $|k| \le k''$, and $n$ sufficiently large

$$|d \cdot F(-u) - d \cdot p_1| < c/n,$$

where $c = 4.2c'c''$, and the lemma is proved.


Lemma 3.
$$\Big|\max_{\theta_0,\theta_1} d \cdot F(-u) - 0.1202/n^{1/2}\Big| < 0.017/n^{3/2}$$
and this occurs when $k = 0$, and for some $d$, where $0.463 < n^{1/2}d < .5316$.

Proof: The maximum value of $uF(-u)$ is 0.1700, and this occurs when $u = 0.7516$
approximately. Also $(1 - k^2)$ is maximized for $k = 0$. Further, $1/(2n + u^2)^{1/2}$
is a decreasing function of $u$. Thus for any $n$, the maximum value of
$d \cdot F(-u) = u(1 - k^2)^{1/2} F(-u)/(2n + u^2)^{1/2}$ occurs for some value of
$u < 0.7516$ (it can be shown that the value of $u$ at which the maximum occurs
increases from $u = 0.655$ when $n = 1$ to $u = 0.7516$ when $n$ approaches infinity).
It is easy to show that
$$1 - x^2/2 < 1/(1 + x^2)^{1/2} < 1, \qquad |x| < 1. \tag{28}$$
Thus, for $u < 1$, we may write
$$\frac{u(1 - k^2)^{1/2}F(-u)}{(2n)^{1/2}}\Big[1 - \frac{u^2}{4n}\Big] < d \cdot F(-u) < \frac{u(1 - k^2)^{1/2}F(-u)}{(2n)^{1/2}} \tag{29}$$
or
$$\Big|d \cdot F(-u) - \frac{u(1 - k^2)^{1/2}F(-u)}{(2n)^{1/2}}\Big| < \frac{u(1 - k^2)^{1/2}F(-u)}{(2n)^{1/2}} \cdot \frac{u^2}{4n}. \tag{30}$$
Thus
$$\Big|\max_{\theta_0,\theta_1} d \cdot F(-u) - 0.1202/n^{1/2}\Big| < 0.017/n^{3/2} \tag{31}$$
and the maximum occurs when $k = 0$, and for some $n^{1/2}d$ in the interval from 0.463 to
0.5316 approximately.
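
The constants quoted in the proof of Lemma 3 can be checked numerically. The sketch below is not part of the original paper: it assumes that $F$ is the standard normal distribution function and uses a plain ternary search (the helper names are illustrative) to locate the maximum of $uF(-u)$, then forms the limiting coefficients $\max uF(-u)/2^{1/2}$ and $u/2^{1/2}$.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal distribution function F(z) (assumed meaning of F)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def g(u: float) -> float:
    """The quantity u * F(-u) that is maximized in the proof of Lemma 3."""
    return u * normal_cdf(-u)

def maximize_unimodal(f, lo: float, hi: float, iters: int = 200):
    """Ternary search for the maximum of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    u = 0.5 * (lo + hi)
    return u, f(u)

u_star, g_max = maximize_unimodal(g, 0.0, 2.0)
leading_constant = g_max / math.sqrt(2.0)  # coefficient of n**-0.5 when k = 0
d_constant = u_star / math.sqrt(2.0)       # limiting value of n**0.5 * d
```

Under these assumptions the search reproduces, to the rounding used in the text, the values 0.1700, 0.7516, 0.1202 and 0.5316.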


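Lemma 4. For any values of $n$ and $d$, $p_1$ is a non-increasing function of $|k|$.
Proof: Let $a_j$ be the sum of all the coefficients of $y^m x^s$ such that $s - m = j$ in
$(\theta_0 y + 1 - \theta_0)^n(\theta_1 x + 1 - \theta_1)^n$. Then
$$p_1 = \sum_{j=1}^{n} \operatorname{Prob}[x_1 - x_0 = j] + \operatorname{Prob}[x_1 - x_0 = 0]/2 \tag{32}$$
$$= \sum_{j=1}^{n} a_j + a_0/2. \tag{33}$$
Let $y = 1/x$. Then
$$(\theta_0 y + 1 - \theta_0)^n(\theta_1 x + 1 - \theta_1)^n = B_0^n B_1^n/x^n \tag{34}$$
where
$$B_0 = \theta_0 + x - \theta_0 x, \qquad B_1 = \theta_1 x + 1 - \theta_1.$$
Thus $a_j$ is the coefficient of $x^j$ in $B_0^n B_1^n/x^n$. Now we may write
$$B_0^n B_1^n/x^n(1 - x) = \sum_{t=-n}^{\infty} b_t x^t,$$
where
$$b_t = \sum_{j=-n}^{t} a_j.$$
Thus
$$p_1 = b_n - b_0 + (b_0 - b_{-1})/2 = b_n - (b_0 + b_{-1})/2,$$
i.e.
$$p_1 = \text{coefficient of } x^n \text{ in } B_0^n B_1^n(2 - x^n - x^{n+1})/2x^n(1 - x)
      = \text{coefficient of } x^{2n} \text{ in } T,$$
where
$$T = B_0^n B_1^n(2 - x^n - x^{n+1})/2(1 - x).$$
Now
$$2B_0 = 1 + k + d + x - xk - xd,$$
$$2B_1 = x + kx - dx + 1 - k + d.$$
Thus
$$\frac{\partial T}{\partial k} = -\frac{nk}{4} B_0^{n-1} B_1^{n-1}(1 - x)(2 - x^n - x^{n+1}) \tag{41}$$
and
$$\frac{\partial p_1}{\partial k} = -\frac{nk}{4} \cdot \text{coefficient of } x^{2n} \text{ in } B_0^{n-1} B_1^{n-1} x^n(x^2 - 1). \tag{42}$$
If $n = 1$, $\partial p_1/\partial k = 0$.

For $n > 1$, using the methods outlined above, it is not difficult to show that for a
sample of size $n - 1$,
$$\operatorname{Prob}[x_0 - x_1 = 1] = \text{coefficient of } x^{2n} \text{ in } B_0^{n-1} B_1^{n-1} x^{n+2} \tag{43}$$
and
$$\operatorname{Prob}[x_0 - x_1 = -1] = \text{coefficient of } x^{2n} \text{ in } B_0^{n-1} B_1^{n-1} x^{n}.$$
Thus
$$\begin{aligned}
\frac{\partial p_1}{\partial k}
 &= -\frac{nk}{4}\big(\operatorname{Prob}[x_0 - x_1 = 1] - \operatorname{Prob}[x_0 - x_1 = -1]\big) \\
 &= -\frac{nk}{4}\Big[\sum_{s=0}^{n-2}\binom{n-1}{s}\binom{n-1}{s+1}
    \theta_0^s\theta_1^s(1 - \theta_0)^{n-s-2}(1 - \theta_1)^{n-s-2}(\theta_0 - \theta_1)\Big].
\end{aligned} \tag{44}$$
The bracketed portion is never negative for $\theta_0 > \theta_1$ and thus
$\partial p_1/\partial k = 0$ or has the sign opposite that of $k$, and $p_1$ is a
non-increasing function of $|k|$.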
Using lemmas 1, 2, 3 and 4, we have 


Theorem 1. For sufficiently large $n$,
$$\Big|\max_{\theta_0,\theta_1} d \cdot p_1 - 0.1202/n^{1/2}\Big| < c/n + 0.017/n^{3/2}$$
and the maximum occurs for $k = 0$, and for some $d$ in the interval from
$0.463/n^{1/2}$ to $0.5316/n^{1/2}$.
Use of A Priori Knowledge 


Suppose we have a priori information that $|\theta_0 + \theta_1 - 1| > k'$, where $k'$ is
some positive constant. That is, $|k| > k'$. Then we wish to obtain
$\max_{\theta_0,\theta_1} d \cdot p_1$ subject to the above restriction. Let $\max' f$
represent the maximum of $f$ subject to $|k| > k'$. It is clear that
$$\begin{aligned}
\max{}' \, d \cdot F(-u) &= \max{}' \, (1 - k^2)^{1/2} uF(-u)/(2n + u^2)^{1/2} \\
                         &= (1 - k'^2)^{1/2} \max_u \, uF(-u)/(2n + u^2)^{1/2}
\end{aligned} \tag{45}$$
if the maximizing value of $u$ (between .655 and .7516) is attainable when $|k| = k'$. But
when $|k| = k'$, $d$ ranges from 0 to $1 - k'$ and $u = (2n)^{1/2}d/(1 - k^2 - d^2)^{1/2}$
is increasing in $d$ and ranges from 0 to $n^{1/2}(1 - k')^{1/2}/k'^{1/2}$. The maximizing
value of $u$ is attainable if $n^{1/2}(1 - k')^{1/2}/k'^{1/2} > .7516$, or
$n > .5649k'/(1 - k')$. Thus following the steps in the proof of lemma 3, we have


Lemma 3'. For $n > .5649k'/(1 - k')$,
$$\Big|\max{}' \, d \cdot F(-u) - 0.1202(1 - k'^2)^{1/2}/n^{1/2}\Big| < 0.017(1 - k'^2)^{1/2}/n^{3/2}.$$
The maximum occurs for $|k| = k'$, and for some value of $d$ in the interval from
$0.463(1 - k'^2)^{1/2}/n^{1/2}$ to $0.5316(1 - k'^2)^{1/2}/n^{1/2}$ approximately.

Using lemmas 1, 2, 3' and 4, we have


Theorem 1'. For sufficiently large $n$,
$$\Big|\max{}' \, d \cdot p_1 - 0.1202(1 - k'^2)^{1/2}/n^{1/2}\Big| < c/n + 0.017(1 - k'^2)^{1/2}/n^{3/2}.$$


ESTIMATES OF SAMPLING VARIANCE WHERE TWO UNITS
ARE SELECTED FROM EACH STRATUM


NATHAN KEYFITZ
Dominion Bureau of Statistics 


Recent years have seen an increased demand for sampling methods 
which permit simple computations of means and variances. For efficient 
designs the calculation of variances though theoretically possible may 
be difficult enough that busy statisticians omit to do it, and so a basic 
advantage of probability sampling is lost. A practical method for the 
simple treatment of cluster designs with slight loss of efficiency has 
been devised by Deming;¹ it involves an ingenious arrangement of the
population in such fashion that both the sampling units and the strata 
in which they are grouped are effectively equal in size, so that equal 
weights are appropriate for estimating both totals and variances. 

The present paper describes a simplification in the different direction 
of restricting selection to two units from each stratum. To the extent 
that this involves some sacrifice of efficiency we recall that the decision 
on which sampling method to use ought to take account of simplicity of 
calculation of mean and variance as well as of efficiency. 


1. VARIANCE OF ESTIMATES OF POPULATION TOTALS 


A CONVENIENT starting point is the proposition that the variance of a sum of
two values of a random variable is estimated by the square of the difference
between them:

$$\operatorname{Var}(x_{11} + x_{12}) = E(x_{11} - x_{12})^2, \tag{1}$$

where $x_{11}$ and $x_{12}$ are estimates made from two samples drawn independently
at random from the first stratum, and $E$ as usual stands for the operation of
averaging over all possible samples. The proof depends only on

$$Ex_{11} = Ex_{12} \quad \text{and} \quad E(x_{11}x_{12}) = (Ex_{11})(Ex_{12}),$$

which conditions will apply if $x_{11}$ and $x_{12}$ are randomly selected with
replacement from the same population. (All the formulas developed in this paper apply
approximately if selection is without replacement; one might use a finite population
correction but in practice I have not done so.) A simple proof is as follows:

$$\begin{aligned}
\operatorname{Var}(x_{11} + x_{12}) &= E(x_{11} + x_{12} - E(x_{11} + x_{12}))^2 \\
 &= E(x_{11} - Ex_{11} + x_{12} - Ex_{12})^2 \\
 &= E(x_{11} - Ex_{11} - x_{12} + Ex_{12})^2 + 4E\{(x_{11} - Ex_{11})(x_{12} - Ex_{12})\},
\end{aligned}$$

and, since the last term is zero on our assumptions,

$$\begin{aligned}
 &= E(x_{11} - Ex_{11} - x_{12} + Ex_{12})^2 \\
 &= E(x_{11} - x_{12})^2.
\end{aligned}$$





¹ W. Edwards Deming, “On simplifications of sampling design through replication with equal probabilities and
without stages,” Journal of the American Statistical Association, 51 (1956), 24-53.

The fact that the expected value of the squared difference equals the variance
in which we are interested is hardly of use in any one stratum, but it becomes
useful if a number of such strata can be added. If $x_{21}$ and $x_{22}$ are from the
second stratum, then a similar expression holds for the information on variance
contained in this stratum and

$$\operatorname{Var}(x_{11} + x_{12} + x_{21} + x_{22}) = E\{(x_{11} - x_{12})^2 + (x_{21} - x_{22})^2\}. \tag{2}$$


When further strata are added it is necessary only to add further squared
differences to the expression following $E$ in (2) in order to obtain an estimate of the
variance of the total.
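
The proposition can be illustrated in a few lines of Python. This is only a sketch: the function names and the numbers are hypothetical, and the exact check assumes two draws with replacement and equal probabilities from one small stratum.

```python
from fractions import Fraction
from itertools import product

def variance_of_total(pairs):
    """Estimated variance of the grand total: sum over strata of (x_s1 - x_s2)**2."""
    return sum((x1 - x2) ** 2 for x1, x2 in pairs)

def exact_check(population):
    """Compare Var(x1 + x2) with E(x1 - x2)**2 over all ordered pairs drawn
    with replacement, with equal probabilities, from one stratum."""
    draws = list(product(population, repeat=2))
    n = len(draws)
    mean_sum = Fraction(sum(a + b for a, b in draws), n)
    var_sum = sum((a + b - mean_sum) ** 2 for a, b in draws) / n
    mean_sq_diff = Fraction(sum((a - b) ** 2 for a, b in draws), n)
    return var_sum, mean_sq_diff

var_sum, mean_sq_diff = exact_check([3, 7, 8, 12])          # hypothetical stratum
estimate = variance_of_total([(10, 14), (21, 18), (7, 7)])  # hypothetical pairs
```

For the hypothetical stratum both sides of (1) equal 41/2, and the three hypothetical pairs give an estimated variance of 16 + 9 + 0 = 25 for the total.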


2. COVARIANCE BETWEEN TWO ESTIMATES 


It is assumed above that drawings are independent as among the several 
strata and there is therefore no need to consider cross-products of terms in 
different strata. However, another characteristic may be measured in the same
selected units of the several strata; denote it by $y$ with subscripts corresponding
to those of $x$. We can then conveniently estimate the covariance of the sample
totals for the $x$ and $y$ characteristics in two strata by employing the identity:

$$\operatorname{Covar}(x_{11} + x_{12})(y_{11} + y_{12}) = E(x_{11} - x_{12})(y_{11} - y_{12}). \tag{3}$$

This relationship is established by an obvious extension of the argument used
for (1) and it is similarly additive through any number of strata. No particular
relationship is assumed between the $x$'s and the corresponding $y$'s, but we do
assume zero correlation between $x$'s and $y$'s not referring to the same unit. For
example, $x_{11}$ and $y_{11}$ may be correlated in successive drawings of the sample,
but $x_{11}$ and $y_{12}$ are not.

By application of (1) and (3) it is possible to express the variance of any
linear combination of $x$'s and $y$'s as a perfect square. Thus, if we wish to weight
the two characteristics differently, the first by $p_1$ and second by $p_2$, the variance
of the weighted sum is estimated by a simple squared expression (which may
be extended to any number of terms):

$$\operatorname{Var}\{p_1(x_{11} + x_{12}) + p_2(y_{11} + y_{12})\} = E\{p_1(x_{11} - x_{12}) + p_2(y_{11} - y_{12})\}^2. \tag{4}$$
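
As a small illustration of why the squared form is convenient, the sketch below (hypothetical function names and data) computes the estimate of (4) summed over two strata and compares it with the expanded form built from the variance and covariance estimates of (1) and (3).

```python
def covariance_of_totals(x_pairs, y_pairs):
    """Formula (3) summed over strata: sum of (x_s1 - x_s2) * (y_s1 - y_s2)."""
    return sum((x1 - x2) * (y1 - y2)
               for (x1, x2), (y1, y2) in zip(x_pairs, y_pairs))

def variance_of_weighted_sum(x_pairs, y_pairs, p1, p2):
    """Formula (4) summed over strata: one squared term per stratum."""
    return sum((p1 * (x1 - x2) + p2 * (y1 - y2)) ** 2
               for (x1, x2), (y1, y2) in zip(x_pairs, y_pairs))

x_pairs = [(10, 14), (21, 18)]      # hypothetical x estimates, two strata
y_pairs = [(100, 110), (205, 190)]  # hypothetical y estimates, same units
p1, p2 = 2, -1                      # hypothetical weights

squared_form = variance_of_weighted_sum(x_pairs, y_pairs, p1, p2)
var_x = sum((a - b) ** 2 for a, b in x_pairs)
var_y = sum((a - b) ** 2 for a, b in y_pairs)
expanded_form = (p1 ** 2 * var_x + p2 ** 2 * var_y
                 + 2 * p1 * p2 * covariance_of_totals(x_pairs, y_pairs))
```

Both routes give 85 for these numbers; the squared form needs only one pass through the strata.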


3. VARIANCE OF RATIO ESTIMATES 


Formula (4) however is not a common application; more often the y’s are 
required for ratio estimates. For example, each x might be the number of un- 
employed persons in a primary sampling unit selected in the sample, and each 
y might be the corresponding count of persons in the sample. If we know the 
total population from sources outside the sample it may be efficient to use the 
sample to estimate only the ratio 


$$\frac{x_{11} + x_{12}}{y_{11} + y_{12}},$$

the number of unemployed persons per person in the population. According to
the usual approximation for the ratio estimate, if $V^2$ symbolizes rel-variance
as defined by Hansen, Hurwitz, and Madow,² i.e. the variance divided by the
square of the expected value, then

$$V^2_{u/v} \doteq E\Big(\frac{u}{Eu} - \frac{v}{Ev}\Big)^2 = V_u^2 + V_v^2 - 2V_{uv}.$$

The effect of this approximation is to provide a linear estimate such as that
discussed in the preceding section. We now write $u = x_{11} + x_{12}$ and
$v = y_{11} + y_{12}$. It may be shown that

$$V^2_{(x_{11}+x_{12})/(y_{11}+y_{12})} = E\Big(\frac{x_{11} - x_{12}}{E(x_{11} + x_{12})} - \frac{y_{11} - y_{12}}{E(y_{11} + y_{12})}\Big)^2.$$

We will prove a more general statement, involving not one stratum but a number
of strata; $s$ and $s'$ designate these strata.

$$\begin{aligned}
V^2_{\sum_s(x_{s1}+x_{s2})/\sum_s(y_{s1}+y_{s2})}
 &= V^2_{\sum_s(x_{s1}+x_{s2})} + V^2_{\sum_s(y_{s1}+y_{s2})} - 2V_{\sum_s(x_{s1}+x_{s2}),\,\sum_s(y_{s1}+y_{s2})} \\
 &= \frac{E\sum_s(x_{s1} - x_{s2})^2}{\{E\sum_s(x_{s1} + x_{s2})\}^2}
  + \frac{E\sum_s(y_{s1} - y_{s2})^2}{\{E\sum_s(y_{s1} + y_{s2})\}^2}
  - 2\,\frac{E\sum_s(x_{s1} - x_{s2})(y_{s1} - y_{s2})}{\{E\sum_s(x_{s1} + x_{s2})\}\{E\sum_s(y_{s1} + y_{s2})\}} \\
 &= E\sum_s\Big\{\frac{x_{s1} - x_{s2}}{E\sum_{s'}(x_{s'1} + x_{s'2})} - \frac{y_{s1} - y_{s2}}{E\sum_{s'}(y_{s'1} + y_{s'2})}\Big\}^2.
\end{aligned} \tag{7}$$

This sum of squares, one contributed by each stratum, is convenient for
calculation, and if the addition is through a sufficiently large number of strata the
$E$'s may be dropped to provide an estimate of $V^2$.
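
A minimal sketch of the calculation described by (7), with the expectation signs dropped and the sample totals used in their place. The function name and the data are hypothetical.

```python
def ratio_and_relvariance(x_pairs, y_pairs):
    """Return the ratio estimate and its estimated rel-variance from (7),
    with each E replaced by the corresponding sample total."""
    x_total = sum(x1 + x2 for x1, x2 in x_pairs)
    y_total = sum(y1 + y2 for y1, y2 in y_pairs)
    rel_var = sum(((x1 - x2) / x_total - (y1 - y2) / y_total) ** 2
                  for (x1, x2), (y1, y2) in zip(x_pairs, y_pairs))
    return x_total / y_total, rel_var

x_pairs = [(10, 14), (21, 18)]      # hypothetical unemployed counts, two strata
y_pairs = [(100, 110), (205, 190)]  # hypothetical person counts, same units
ratio, rel_var = ratio_and_relvariance(x_pairs, y_pairs)
```

The square root of `rel_var` is the estimated coefficient of variation of the ratio; with only two strata, as here, the estimate is of course very unstable.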


4. VARIANCE OF POST-STRATIFIED ESTIMATES 


However, devices far more elaborate than the ordinary ratio estimate are 
introduced to attain efficiency in sample surveys. If the proportion that is un- 
employed varies substantially from one age-sex group to another, and the true 
total number of persons in each age-sex is known from sources outside the 
sample, the sample need be used to estimate only the proportion unemployed 
in each age-sex group rather than the absolute number unemployed or the 
proportion unemployed in all age-sex groups together. The argument presented 
above can be applied to each age-sex group separately. One might think of
doing an entire survey merely in order to find for example the percentage
unemployed among males aged 20 to 24, and then another survey for males 25
to 34, etc. If such surveys were carried out independently for all the age-sex
groups, the estimate of the number unemployed would be

$$\sum_a p_a \frac{\sum_s (x_{sa1} + x_{sa2})}{\sum_s (y_{sa1} + y_{sa2})} \tag{8}$$

where $p_a$ is the precalculated total population in the $a$th age-sex group.
In these circumstances the variance of (8) would be estimated by a weighted
total of expressions such as that under the $E$ in (7). If we write for brevity

$$d_{sa} = \frac{\sum_{s'}(x_{s'a1} + x_{s'a2})}{\sum_{s'}(y_{s'a1} + y_{s'a2})}
\Big\{\frac{x_{sa1} - x_{sa2}}{\sum_{s'}(x_{s'a1} + x_{s'a2})} - \frac{y_{sa1} - y_{sa2}}{\sum_{s'}(y_{s'a1} + y_{s'a2})}\Big\}$$

and adopt at this point the inevitable estimating procedure:

$$E\sum_s (x_{sa1} + x_{sa2}) = \sum_s (x_{sa1} + x_{sa2}),$$

then the variance of the estimate (8) is given by

$$\sum_a \Big(p_a^2 \sum_s d_{sa}^2\Big). \tag{9}$$

² M. H. Hansen, W. N. Hurwitz, and W. G. Madow, Sample Survey Methods and Theory, New York: John
Wiley and Sons, Inc., 2 vols, 1953.


This method of estimating variance will however not apply if data on the 
several ages is obtained from a single sample; the difficulty then arises in com- 
puting the sampling error of (8) that the results for the several age-sex groups 
are not independent. The lack of independence results from the use of strata 
involving criteria other than age and sex. The expression for the variance of (8) 
would therefore require (if seven age groups and two sexes are recognized) the 
evaluation of (7) fourteen times as in (9), and in addition the evaluation of
expressions analogous to (7) for the 91 covariances among the fourteen age-sex
classes.

It fortunately happens that the covariance between the $a$th and $b$th age-sex
groups can be estimated separately in each stratum and then added through
the strata; the covariance of the $s$th stratum, furthermore, turns out to be
$d_{sa}d_{sb}$. The proof of this is analogous to that in support of (7). Hence the
variance of (8) where we cannot assume independence of the several age-sex groups
is

$$\begin{aligned}
\sum_a \Big(p_a^2 \sum_s d_{sa}^2\Big) + 2\sum_a \sum_{b<a} \Big(p_a p_b \sum_s d_{sa}d_{sb}\Big)
 &= \sum_s \Big\{\sum_a p_a^2 d_{sa}^2 + 2\sum_a \sum_{b<a} p_a p_b d_{sa} d_{sb}\Big\} \\
 &= \sum_s \Big(\sum_a p_a d_{sa}\Big)^2.
\end{aligned} \tag{10}$$

It is because in each stratum the contribution to the covariance term is exactly
equal to twice the product of the square roots of the corresponding variance 
terms that we can write the entire stratum contribution to variance as a square. 
Thus (10) is the variance of (8) whether or not the sampling for the several 
ages is carried out independently. It is fortunate that (10) does not involve 
explicitly either variances or covariances of the several age-sex groups, but may 
be thought of as a computing device for obtaining the identical result by simple 
weighted totals of 14 differences, squared and added through the strata. (10) 
is the principal contribution of this paper. 
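
Formula (10) lends itself to direct computation. The sketch below is one possible arrangement, not the author's program: the data layout (`groups[a] = (x_pairs, y_pairs)`), the function names and the figures are all assumptions made for illustration.

```python
def d_value(x_pairs, y_pairs, s):
    """d_sa for stratum s within one age-sex group:
    (X / Y) * ((x_s1 - x_s2) / X - (y_s1 - y_s2) / Y), X and Y being sample totals."""
    x_total = sum(a + b for a, b in x_pairs)
    y_total = sum(a + b for a, b in y_pairs)
    dx = x_pairs[s][0] - x_pairs[s][1]
    dy = y_pairs[s][0] - y_pairs[s][1]
    return (x_total / y_total) * (dx / x_total - dy / y_total)

def post_stratified_variance(groups, populations):
    """Formula (10): sum over strata of (sum over groups of p_a * d_sa) ** 2.
    groups[a] = (x_pairs, y_pairs) for group a; populations[a] = p_a."""
    n_strata = len(groups[0][0])
    total = 0.0
    for s in range(n_strata):
        stratum_term = sum(p * d_value(x, y, s)
                           for (x, y), p in zip(groups, populations))
        total += stratum_term ** 2
    return total

# Hypothetical data: two age-sex groups observed in the same two strata.
groups = [([(10, 14), (21, 18)], [(100, 110), (205, 190)]),
          ([(5, 3), (8, 11)], [(90, 95), (150, 160)])]
populations = [1000, 800]
variance = post_stratified_variance(groups, populations)
```

Each stratum contributes a single square, so no separate variance or covariance terms need to be stored.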

One possible simplification of (10) deserves notice. It is pointed out to me by 
Max Bershad that when the expected value of the y’s is the same as the pre- 
calculated population p, then a cancellation is possible by which both p and y 
disappear outside the squared difference. If the cancellation is permissible then 
the only change in formulas (9) et seq. is that the $d$'s lose their weights in $x/y$
and the $p$'s are replaced by $x$'s. This cancellation, however, is not proper insofar
as nonrespondents have been omitted from the $x$'s and $y$'s with the intention
of making up the true total by multiplication by p. 


5. OVERALL VARIANCE OF MULTI-STAGE SAMPLE 


For simplicity in the exposition no mention has been made of sampling at 
several stages. The stages may be accommodated for calculating over-all 
variance by appropriate interpretation of the equations above, without any 
formal change in them. The interpretation of the x’s that makes the equations 
suitable for stage sampling is that each is the estimate of one half of the total 
of the stratum from which it was drawn. Thus $x$ may be based on replies
obtained from all the persons within clusters of five households drawn with
sampling fraction $f_3$ from chosen census enumeration areas; these enumeration
areas have been drawn with sampling fraction $f_2$ from the chosen psu's; the
psu's have been drawn with sampling fraction $f_1$. Then $x_{11}$ is the estimate of
total of the characteristic for the half-stratum obtained by multiplying the
sample take (i.e., the total in the sample of individuals with a given attribute)
by the reciprocal of the product of the three sampling ratios and adjusting for
nonresponse. There is no need to show the $f$'s explicitly in the formulae, nor,
once $f_1 f_2 f_3$ is known, to consider which sample households were selected from
which enumeration districts or from which clusters. The random process whose 
variance is under investigation is the making of estimates from successive 
sample drawings using a similar three-stage method but not retaining selected 
units of any stage from one drawing to the next. 
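
The weighting-up step can be written as a one-line function. The sketch is hypothetical: in particular, dividing by an overall response rate is only one simple way to adjust for nonresponse, and the text does not say which adjustment was used.

```python
def half_stratum_estimate(sample_take, f1, f2, f3, response_rate=1.0):
    """Sample take multiplied by the reciprocal of the product of the three
    sampling fractions, then adjusted for nonresponse (assumed form:
    division by the response rate)."""
    return sample_take / (f1 * f2 * f3) / response_rate

# Hypothetical figures: 12 persons with the attribute in the sample take.
x11 = half_stratum_estimate(12, f1=0.5, f2=0.1, f3=0.2, response_rate=0.96)
```

With these figures the estimate is about 1250; only the product of the fractions enters, which is why the individual stages need not appear in the variance formulas.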


6. PARTITION OF VARIANCE INTO STAGES 


While to calculate the over-all sampling variance requires no identification
of the sample except by primary sampling unit, the breakdown of this over-all
variance by stages does require such identification. Formulas identical with
those above given can be applied at the second stage of selection, say census
enumeration areas. Here again the sample take of each enumeration area
is multiplied up to constitute an estimate of its portion of the population, i.e.,
by $1/f_1 f_2 f_3$ and the factor for nonresponse. Where more than two enumeration
areas are selected within each primary sampling unit, the formulas above are
applied by taking pairs of enumeration areas, considering each pair as a
stratum, interpreting $x$ as equal to the sample take in a given enumeration
area multiplied by the same reciprocal of product of sampling fractions as
before. Similarly for further stages.

If for a particular purpose the number of squares required is substantially 
less than the number of strata (as is likely in all stages after the first), one may 
group strata to any extent desired and correspondingly reduce the number of 
squares of differences to be calculated. Once again all formulas still apply but
the interpretation of the symbols is altered. Suppose we wish to amalgamate
the first and second strata so as to secure only one comparison for the four
sampling units involved, and then do the same for the third and fourth strata,
etc. We simply replace $x_{11}$ by $x_{11} + x_{21}$ and $x_{12}$ by $x_{12} + x_{22}$;
$x_{21}$ by $x_{31} + x_{41}$, etc.,
in the formulas above. As before each $x$ is the sample take multiplied to provide
estimates of population totals. 

In analyzing the results to ascertain the contribution to error of each stage 
one takes successive differences of the gross variances found at the several 
stages by the formulas above discussed. Thus when from the over-all variance 
the variance obtained by pairing census enumeration areas is subtracted the 
difference must be an estimate of the variance arising between primary sam- 
pling units; similar differences estimate net variances which are attributable to 
second, third, etc., stage units. If the gross variance estimated by the
application of (10) to primary sampling units is called $S_1^2$, the gross variance
obtained by applying (10) to second-stage units (census enumeration areas) is called
$S_2^2$, etc., then the net variance at the psu stage is simply $s_1^2 = S_1^2 - S_2^2$;
the net variance at the second or census enumeration area stage is
$s_2^2 = S_2^2 - S_3^2$. The variance of the estimated net variance for the first stage
will be at most the sum of the variances of the estimates of gross variance of the first
and second stages. This may be large, however. It is unfortunate that the figure which
one would wish was subject to the smallest percentage error from the viewpoint
of this process of successive differences (the estimate $S_1^2$ of variance arising
from all stages together) is the one for which the number of comparisons (i.e.,
pairs) is most severely limited.
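
The successive differences can be sketched as follows. The function name and the numbers are hypothetical, and treating the gross variance of the last stage as its own net variance is an assumption added here to close the sequence.

```python
def net_variances(gross):
    """gross = [S1^2, S2^2, ...] ordered from the first stage to the last.
    Returns [s1^2, s2^2, ...] with s_i^2 = S_i^2 - S_(i+1)^2; the last gross
    variance is carried over unchanged (assumption: no further stage below it)."""
    nets = [a - b for a, b in zip(gross, gross[1:])]
    nets.append(gross[-1])
    return nets

stage_contributions = net_variances([900, 500, 200])  # hypothetical gross variances
```

With these figures the net variances are 400, 300 and 200, which add back to the over-all variance of 900.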


7. VARIANCE OF VARIANCE ESTIMATES 


When only two units are selected from each stratum no unbiased estimate of 
the variance of the estimated variance is available. Biased estimates are ob- 
tained by “collapsing” (using the expression of Hansen, Hurwitz and Madow) 
pairs of strata and assuming that for the two “neighboring” pairs which are 
assimilated together the true variance is the same. We will in this section
however suppose that four units have been drawn from each stratum. The
proposition that the variance of a sum is estimated by the square of the difference
(Formula (1) above) is again useful; our calculation may start with the variance
of the sum $x_{11} + x_{12} + x_{21} + x_{22}$. This is equal to the sum of two variances,
namely of $x_{11} + x_{12}$ and of $x_{21} + x_{22}$, and the variance of its estimate may
accordingly be estimated by the square of the difference between the estimate of
variance of $x_{11} + x_{12}$ and that of $x_{21} + x_{22}$; thus

$$\begin{aligned}
\operatorname{Var}\{\operatorname{estvar}(x_{11} + x_{12} + x_{21} + x_{22})\}
 &= \operatorname{Var}\{\operatorname{estvar}(x_{11} + x_{12}) + \operatorname{estvar}(x_{21} + x_{22})\} \\
 &= \operatorname{Var}\{(x_{11} - x_{12})^2 + (x_{21} - x_{22})^2\} \\
 &= E\{(x_{11} - x_{12})^2 - (x_{21} - x_{22})^2\}^2
\end{aligned} \tag{11}$$

where estvar stands for estimated variance. The quantity following $E$ in (11),
of the 4th power in the observations, can be added through successive
quadruplets in the same way as the variances themselves. No new principle arises in
its application. The variance of the estimated variance at each stage may be 
calculated from quadruplets such as those of (11) at the several stages sepa- 
rately, and then adding. One need not stop with finding the accuracy of the 
estimate of variance but could go on to estimate the variance of the estimated 
variance of the estimated variance, again without introduction of any new 
formulas. A declining amount of information would be available from a given 
size of sample, and fortunately this corresponds to the lesser amount of interest 
in higher moments. 
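
A short sketch of (11) added through quadruplets, with a hypothetical function name and hypothetical data:

```python
def variance_of_variance_estimate(quadruplets):
    """Formula (11) summed over quadruplets (x11, x12, x21, x22):
    the square of the difference between the two squared differences."""
    return sum(((a - b) ** 2 - (c - d) ** 2) ** 2 for a, b, c, d in quadruplets)

vv = variance_of_variance_estimate([(10, 14, 21, 18), (7, 7, 5, 9)])
```

Here the two quadruplets contribute (16 - 9)² = 49 and (0 - 16)² = 256, giving 305.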


8. VARIANCES OF DIFFERENCES OF ESTIMATES BASED ON SUCCESSIVE SURVEYS 


When surveys are carried on periodically users are especially concerned with 
changes, say from one month to the next, and hence require to know the error of 
the estimate of change. If two successive months were sampled independently 
the variance of the estimated change would be equal to the sum of the variances 
of the totals for the two months taken separately. If identical or overlapping 
samples are used from month to month the sampling variations of one month 
are correlated with those of the next, and this correlation serves to reduce the 
error of estimates of differences. Once again we are fortunate, for when a single 
stratum from which two units are selected is considered, it happens that the 
covariance of results for two months is equal to the square root of the product 
of the variances of the separate months. As before this makes it possible to 
factor into a perfect square the contribution to variance of each stratum, and 
we have for the estimate of the variance of the difference between estimates in 
the two months: 


$$\sum_s \Big\{\sum_a (p_a d_{sa} - p_a' d_{sa}')\Big\}^2 \tag{12}$$


where the symbols expressing results for one month are primed and for the 
other unprimed. 

Formula (12) is valid for estimates of the difference for any two months, 
successive or not. It may be applied to any one of the stages and to ratio 
estimates or simple estimates. 
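
Formula (12) can be sketched in the same way as (10). The layout `d[s][a]` for the $d$-values and the figures below are assumptions made for illustration only.

```python
def variance_of_change(d_now, d_prev, p_now, p_prev):
    """Formula (12): d_now[s][a] and d_prev[s][a] are the d-values for stratum s
    and group a in the two months; p_now[a] and p_prev[a] are the weights."""
    total = 0.0
    for row_now, row_prev in zip(d_now, d_prev):
        diff = sum(p * d - q * e
                   for p, d, q, e in zip(p_now, row_now, p_prev, row_prev))
        total += diff ** 2
    return total

# Hypothetical one-stratum, two-group example.
change_var = variance_of_change([[0.01, 0.02]], [[0.0, 0.01]],
                                [100, 200], [100, 200])
```

When the two months give identical $d$-values and weights the estimate is zero, which reflects the gain from using identical samples.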


9. MISCELLANEOUS 


The formulas of this paper are derived on the assumption that the units are 
chosen with replacement from each stratum. In practice they will hold
approximately where the selection is without replacement, though if the number of
units in the population is small in each stratum it will be an improvement to
apply the usual finite population correction. They are applicable with the 
usual bias in the estimate of variance when only one unit has been chosen from 
each stratum in the first place, and pairs of such strata have been collapsed; 
and with bias also when systematic sampling has been used and the pairs of 
“strata” are those containing successive members of the systematic sample. 

The method is without bias but suffers some loss of information when the 
original sample includes more than two items from each stratum. Some or all 
of the sample units may be paired at random, and the appropriate variance 
formula applied. The number of pairs to be used for each stratum is determined 
only by the amount of precision needed in the estimate of variance. Thus a 
single pair may be chosen, or all the sample units may be paired, or the sample 
units from a stratum may be added into two groups, and the difference between 
these used. When 2n >2 units originally drawn from a homogeneous normally 
distributed population are randomly paired there is one degree of freedom for 
each of the n pairs, so that the loss of efficiency is (n—1)/(2n—1), which is 
always less than one-half. 


10. CONCLUSION 


Elementary methods have been used to derive formula (10) which may be 
thought of as a computing device for finding the joint effect of a number of 
variances and co-variances. 





APPLICATIONS OF MULTIVARIATE POLYKAYS TO THE 
THEORY OF UNBIASED RATIO-TYPE ESTIMATION 


D. S. ROBSON
Cornell University 


The multivariate polykays, or multipart k-statistics, are obtained as 
a slight extension of results given by Tukey [4] for the univariate 
polykays. The relationship between this system and the system of 
multivariate symmetric means is indicated and multiplication formulae
are given. An application of these results to the construction of unbiased 
ratio-type estimators and variance estimates for finite populations is 
given as a further illustration of the usefulness of polykays. 


INTRODUCTION 


MULTIVARIATE estimation problems arise in the field of survey sampling
where each sample element is commonly measured for a variety of char-
acteristics. Frequently census data is available for some of the variates being 
measured and may then be used to adjust population estimates for related 
characters. Such adjustments generally lead to ratio-type estimators, the
familiar forms being

$$\tilde{y} = \bar{X}\,\frac{\bar{y}}{\bar{x}}$$

and

$$\hat{y} = \bar{X}\bar{r},$$

where $\bar{X}$ is a known population mean, $\bar{x}$ and $\bar{y}$ are sample means, and
$\bar{r}$ is the sample mean ratio of $y$ to $x$, $\bar{r} = (1/n)\sum^n (y/x)$. Both of
these estimators $\tilde{y}$ and $\hat{y}$ are biased by an amount which was apparently
unknown until recently published by Hartley and Ross [2]. These authors gave exact
expressions for the bias and also computed an adjustment to $\hat{y}$ which eliminated
its bias; thus,

$$\hat{y}' = \bar{X}\bar{r} + \frac{(N-1)n}{N(n-1)}(\bar{y} - \bar{r}\bar{x}).$$

The exact variance formula for $\hat{y}'$, under the assumption that the population
size $N$ is large relative to $n$, appears in a paper by Goodman and Hartley [1]
and is here computed independently by the author for the case $N$ finite using
multivariate polykays.
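
The adjusted estimator is easy to compute from a sample. The sketch below uses a hypothetical function name and a hypothetical population; it returns both the mean-of-ratios estimate $\bar{X}\bar{r}$ and the Hartley-Ross adjusted estimate, assuming simple random sampling without replacement.

```python
def hartley_ross(x, y, pop_mean_x, pop_size):
    """Return (X_bar * r_bar, adjusted estimate) for a sample of paired values.
    The adjustment adds (N - 1) n / (N (n - 1)) * (y_bar - r_bar * x_bar)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    r_bar = sum(yi / xi for xi, yi in zip(x, y)) / n
    mean_of_ratios = pop_mean_x * r_bar
    correction = ((pop_size - 1) * n / (pop_size * (n - 1))
                  * (y_bar - r_bar * x_bar))
    return mean_of_ratios, mean_of_ratios + correction

# Hypothetical population of N = 4 units; one sample of n = 2.
pop_x, pop_y = [1, 2, 4, 5], [2, 5, 7, 12]
biased, unbiased = hartley_ross([1, 4], [2, 7], sum(pop_x) / 4, 4)
```

Averaging the adjusted estimate over all six possible samples of two units from this small population returns the population mean of $y$, 6.5, as unbiasedness requires.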

Approximate variance formulae for both $\tilde{y}$ and $\hat{y}$ have long been known,
both generally taking the same form

$$\operatorname{var}(\tilde{y}) \approx \operatorname{var}(\hat{y}) \approx
\frac{\bar{Y}^2}{n}\Big[\frac{\sigma_y^2}{\bar{Y}^2} + \frac{\sigma_x^2}{\bar{X}^2}
 - \frac{2\operatorname{Cov}(x, y)}{\bar{X}\bar{Y}}\Big].$$

A third type of adjustment to census data is illustrated by the estimator

$$y^* = \frac{\bar{x}\bar{y}}{\bar{X}}$$

which is well known to have a variance of approximately

$$\operatorname{var}(y^*) \approx
\frac{\bar{Y}^2}{n}\Big[\frac{\sigma_y^2}{\bar{Y}^2} + \frac{\sigma_x^2}{\bar{X}^2}
 + \frac{2\operatorname{Cov}(x, y)}{\bar{X}\bar{Y}}\Big].$$

Clearly, $y^*$ will generally have greater precision than either $\tilde{y}$ or
$\hat{y}$ if the correlation between the variables $x$ and $y$ is negative, and it is
therefore considered worthwhile to present here a corresponding unbiased “product”
estimator and an unbiased estimator of its variance. Finally, as a third application of
multivariate symmetric means and polykays we compute an unbiased estimator
corresponding to
3 £52 

XY 


MULTIVARIATE SYMMETRIC MEANS 


The polynomial

$$\frac{1}{(n)_r}\sum_{j_1 \ne \cdots \ne j_r}
(x_{1j_1}^{a_{11}} \cdots x_{mj_1}^{a_{m1}})(x_{1j_2}^{a_{12}} \cdots x_{mj_2}^{a_{m2}}) \cdots
(x_{1j_r}^{a_{1r}} \cdots x_{mj_r}^{a_{mr}}) \tag{1}$$

in the $mn$ variates $x_{ij}$, $i = 1, \dots, m$; $j = 1, \dots, n$, is called a symmetric mean.
If these $mn$ variates represent a random sample of $n$ observations from an
$m$-dimensional finite population of size $N$ then the mean value of the statistic
(1) taken over all $\binom{N}{n}$ possible samples is clearly the corresponding symmetric
mean for the population. Herein, of course, lies the utility of the concept, since 
any polynomial function of the observations can be expressed as a linear com- 
bination of symmetric sample means and its theoretical expectation immedi- 
ately determined simply by replacing each symmetric sample mean with the 
corresponding symmetric population mean. 

Tukey [4] and Hooke [3] have adopted the abbreviated notation
$\langle (a_1), \dots, (a_r) \rangle$ for this function in the univariate case $m = 1$, and the
same symbol may be applied to the multivariate function (1) with the understanding
that $a_1, \dots, a_r$ is a set of vectors with $a_j = (a_{1j}, \dots, a_{mj})$. A partition of the
set $\alpha = \{a_1, \dots, a_r\}$ into $k$ nonempty subsets $\alpha^{(1)}, \dots, \alpha^{(k)}$ shall be
denoted by the symbol $\alpha_k$. The partition $\alpha_k = \{\alpha^{(1)}, \dots, \alpha^{(k)}\}$ may
then be used to define a new symmetric mean $\langle (a_1^*), \dots, (a_k^*) \rangle$ where $a_i^*$ is
the vector sum of all elements $a_j$ contained in the subset $\alpha^{(i)}$ of $\alpha$,

$$a_i^* = \Big(\sum_{j \mid a_j \in \alpha^{(i)}} a_{1j}, \; \dots, \; \sum_{j \mid a_j \in \alpha^{(i)}} a_{mj}\Big).$$

To simplify the notation further, then, we may use the symbol $\langle (\alpha_k) \rangle$ to denote
the symmetric mean $\langle (a_1^*), \dots, (a_k^*) \rangle$ defined in this manner by the
partition $\alpha_k$. Thus, the polynomial (1) becomes abbreviated to $\langle (\alpha_r) \rangle$.

In order to express conveniently the product of two symmetric means
$\langle (a_1), \dots, (a_r) \rangle \langle (b_1), \dots, (b_s) \rangle = \langle (\alpha_r) \rangle \langle (\beta_s) \rangle$,
$r \le s$, we shall require a notation for a set of vectors obtained as a reduction of the
set $\alpha + \beta$ by pairing and adding $v$ elements of $\alpha$ with $v$ elements of $\beta$.
Thus, if we let

$$\rho_v(\alpha_r, \beta_s) = \{(a_{i_1} + b_{j_1}), \dots, (a_{i_v} + b_{j_v}), (a_{i_{v+1}}), \dots, (a_{i_r}), (b_{j_{v+1}}), \dots, (b_{j_s})\}$$

with $\langle (\rho_v(\alpha_r, \beta_s)) \rangle$ denoting the corresponding symmetric mean, and if
$R_v(\alpha_r, \beta_s) = \{\rho_v(\alpha_r, \beta_s)\}$ is the collection of all the
$v!\binom{r}{v}\binom{s}{v}$ possible sets $\rho_v(\alpha_r, \beta_s)$
then we may write the multiplication formula

$$\langle (\alpha_r) \rangle \langle (\beta_s) \rangle = \frac{1}{(n)_r (n)_s} \sum_{v=0}^{r} (n)_{r+s-v}
\sum_{R_v(\alpha_r, \beta_s)} \langle (\rho_v(\alpha_r, \beta_s)) \rangle. \tag{2}$$

For purposes of computing an expectation such as E{ #92} it is convenient 
to have available a multiplication formula for an arbitrary number of symmetric
means each of which is one part; i.e. each of which corresponds to a sum
over a single index. To this end, we let
$\langle (\alpha_k^*) \rangle = \langle (a_1^*) \rangle \cdots \langle (a_k^*) \rangle$ and obtain
the relation

$$\langle (\alpha_r^*) \rangle = \frac{1}{n^r} \sum_{k=1}^{r} (n)_k \sum_{A_k} \langle (\alpha_k) \rangle, \tag{3}$$

where $A_k = \{\alpha_k\}$ is the collection of all possible partitions of the set $\alpha$ into $k$
nonempty subsets.

The numerical calculation of $\langle (\alpha_r) \rangle$ from a set of data could, of course, be
carried out by a direct application of the definition (1), but machine methods
will in general be far simpler if $\langle (\alpha_r) \rangle$ is expressed in terms of one-part
symmetric means, or summations over a single index. For example,
$\sum_{i \ne j} x_i y_j$ is simply computed when expressed as

$$\sum_{i \ne j} x_i y_j = \Big(\sum_i x_i\Big)\Big(\sum_i y_i\Big) - \sum_i x_i y_i.$$
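
The saving can be seen in a small sketch. The brute-force function below follows the definition of a symmetric mean as an average over ordered tuples of distinct observations; the second function evaluates the same two-part mean from sums over a single index. The names and the data are hypothetical.

```python
from itertools import permutations
from math import prod

def symmetric_mean(data, exponents):
    """Average, over all ordered r-tuples of distinct observations, of the
    product of monomials. data is a list of observations (tuples of m variates);
    exponents is a list of r exponent vectors."""
    r = len(exponents)
    terms = [prod(prod(v ** e for v, e in zip(data[j], a))
                  for j, a in zip(idx, exponents))
             for idx in permutations(range(len(data)), r)]
    return sum(terms) / len(terms)

def two_part_from_sums(data, a, b):
    """The same two-part mean using one-part means only:
    (n^2 <(a)><(b)> - n <(a + b)>) / (n (n - 1))."""
    n = len(data)
    ab = tuple(i + j for i, j in zip(a, b))
    return (n * n * symmetric_mean(data, [a]) * symmetric_mean(data, [b])
            - n * symmetric_mean(data, [ab])) / (n * (n - 1))

data = [(1, 2), (3, 5), (4, 1), (2, 2)]  # hypothetical (x, y) observations
direct = symmetric_mean(data, [(1, 0), (0, 1)])
via_sums = two_part_from_sums(data, (1, 0), (0, 1))
```

For these four observations both routes give (10 × 10 - 25)/12 = 6.25, but the direct route visits every ordered pair while the second needs only three single sums.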
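To obtain such an expression for $\langle (\alpha_r) \rangle$ we are required to distinguish
notationally those partitions of the set $\alpha$ into $k$ subsets such that $k_1$ of the
subsets contain exactly one element each, $k_2$ contain exactly two elements each, etc.,
$k_1 + k_2 + \cdots + k_r = k$, $k_1 + 2k_2 + \cdots + rk_r = r$. To denote such a partition we
append the subscripts $k_1, \dots, k_r$ to $\alpha_k$; thus, $\alpha_{k;k_1,\dots,k_r}$. Again,
$\langle (\alpha^*_{k;k_1,\dots,k_r}) \rangle = \langle (a_1^*) \rangle \cdots \langle (a_k^*) \rangle$, and
$A_{k;k_1,\dots,k_r} = \{\alpha_{k;k_1,\dots,k_r}\}$ is the set of all

$$\frac{r!}{\prod_{i=1}^{r} k_i!\,(i!)^{k_i}}$$

possible partitions $\alpha_{k;k_1,\dots,k_r}$. Utilizing the recursion

$$\langle (a_1), \dots, (a_r) \rangle = \frac{1}{n-r+1}\Big[ n\langle (a_1), \dots, (a_{r-1}) \rangle \langle (a_r) \rangle
 - \langle (a_1 + a_r), \dots, (a_{r-1}) \rangle - \cdots - \langle (a_1), \dots, (a_{r-1} + a_r) \rangle \Big] \tag{4}$$

resulting from (2) we obtain the relation

$$\langle (\alpha_r) \rangle = \frac{1}{(n)_r} \sum_{k=1}^{r} (-1)^{r-k} n^k
\sum_{\substack{\sum_i k_i = k \\ \sum_i i k_i = r}} \; \prod_{i=1}^{r} [(i-1)!]^{k_i}
\sum_{A_{k;k_1,\dots,k_r}} \langle (\alpha^*_{k;k_1,\dots,k_r}) \rangle. \tag{5}$$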

MULTIVARIATE POLYKAYS 
The extension of symmetric mean relations from the univariate to the multivariate
case involves no essential change in notation; hence, the symbolic “$\circ$”
multiplication of symmetric means defined by Tukey generalizes in an obvious
manner to the multivariate case, giving

$$\langle (a_1), \dots, (a_r) \rangle \circ \langle (b_1), \dots, (b_s) \rangle = \langle (a_1), \dots, (a_r), (b_1), \dots, (b_s) \rangle.$$

One-part k-statistics, denoted by square brackets $[(v)]$, $v = (v_1, \dots, v_m)$, are
then defined by equating the function

4 Um 


a t+ + + be 
(h,---,@ = Dees ¥ ye ———— — 1 


v)=0 Vm_=9 Um! 


to the o-logarithm of the genere*tiig function M(h, +++, tn), 


. t,,° 


vy! - ° * Um! 


C-) i"-- 
M(h,--+,t&)= >> --: S” laa, + «+ ip edinnies; 
Um=9 


to give 
2 oo tu--+-+b,%™ 
-1+)>-:-- 2X [vy + + >, m)] —=———— = 0 — log M(t, - - 
~ 1 s 4 = "++ +t," 
-D(- v4 0- If aD EO MERIS, ear prone 


i=1 v j=l \ v=o Um=0 * Um! 


where o- | J is the o-product, 
o— I {Cc} =CoC. 
For example, then, letting eis we have the definitions 
[(11)] = ¢(11)) — ((10)) 0 ((01)) = ((11)) — (10), (01)), 
[(12)] = ¢(12)) — ((10)) © ((02)) — 2((01)) o ¢(11)) 
+ 2((10)) o ((01)) o ((01)) 
= ((12)) — ((10), (02)) — 2((01, (11)) + 2((10), (01), (01)). 


The multipart k, or polykay, $[(a_1),\ldots,(a_r)]$, where $a_j=(a_{1j},\ldots,a_{mj})$, is
then defined by o-multiplication of one-part k's, with

$$[(a_1),\ldots,(a_r)]=[(a_1)]\circ\cdots\circ[(a_r)].$$

Thus, for $m=2$,

$$[(11),(01)]=[(11)]\circ[(01)]=\{\langle(11)\rangle-\langle(10),(01)\rangle\}\circ\langle(01)\rangle=\langle(11),(01)\rangle-\langle(10),(01),(01)\rangle.$$
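As an illustration of these definitions, the bivariate polykay $[(11)]=\langle(11)\rangle-\langle(10),(01)\rangle$ can be computed by brute force and compared with the usual sample covariance (divisor $n-1$). This is a minimal sketch under the assumption that a symmetric mean is the average of the product over all ordered selections of distinct observations; the helper names and data are hypothetical, and the enumeration is only practical for small samples.

```python
from fractions import Fraction
from itertools import permutations

def symmetric_mean(data, *parts):
    """Sample symmetric mean <(a1),...,(ar)> for bivariate data.

    data  : list of (x, y) pairs
    parts : one exponent pair (p, q) per part; distinct observations
            are assigned to distinct parts.
    """
    total = Fraction(0)
    count = 0
    for idx in permutations(range(len(data)), len(parts)):
        term = Fraction(1)
        for i, (p, q) in zip(idx, parts):
            x, y = data[i]
            term *= Fraction(x) ** p * Fraction(y) ** q
        total += term
        count += 1
    return total / count

def polykay_11(data):
    """[(11)] = <(11)> - <(10),(01)>."""
    return symmetric_mean(data, (1, 1)) - symmetric_mean(data, (1, 0), (0, 1))

def sample_covariance(data):
    """Ordinary unbiased sample covariance with divisor n - 1."""
    n = len(data)
    xbar = Fraction(sum(x for x, _ in data), n)
    ybar = Fraction(sum(y for _, y in data), n)
    return sum((x - xbar) * (y - ybar) for x, y in data) / (n - 1)

data = [(1, 2), (3, 1), (4, 6), (6, 5), (8, 9)]  # illustrative values
k11 = polykay_11(data)
cov = sample_covariance(data)
print(k11, cov)
```

Exact rational arithmetic is used so that the two quantities can be compared without rounding error.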


To express symmetric means in terms of polykays we simply apply the procedure
used by Tukey by writing out

$$M(t_1,\ldots,t_m)=\circ\text{-}\exp(\Phi(t_1,\ldots,t_m))$$

to give

$$\sum_{v_1=0}^{\infty}\cdots\sum_{v_m=0}^{\infty}\langle(v_1,\ldots,v_m)\rangle\frac{t_1^{v_1}\cdots t_m^{v_m}}{v_1!\cdots v_m!}=1+\sum_{i=1}^{\infty}\frac{1}{i!}\ \circ\text{-}\prod_{j=1}^{i}\Big\{\sum_{v_1=0}^{\infty}\cdots\sum_{v_m=0}^{\infty}[(v_1,\ldots,v_m)]\frac{t_1^{v_1}\cdots t_m^{v_m}}{v_1!\cdots v_m!}-1\Big\}.$$

Thus, for $m=2$,

$$\langle(11)\rangle=[(11)]+[(10)]\circ[(01)]=[(11)]+[(10),(01)],$$

$$\langle(12)\rangle=[(12)]+[(10)]\circ[(02)]+2[(11)]\circ[(01)]+[(10)]\circ[(01)]\circ[(01)]$$

$$=[(12)]+[(10),(02)]+2[(11),(01)]+[(10),(01),(01)].$$

Expressions for multipart symmetric means in terms of polykays are then obtained
by o-multiplication. For example,

$$\langle(11),(01)\rangle=\langle(11)\rangle\circ\langle(01)\rangle=\{[(11)]+[(10),(01)]\}\circ[(01)]=[(11),(01)]+[(10),(01),(01)].$$
Ordinary multiplication of polykays may be performed by first expressing each 
polykay in terms of symmetric means, then applying the ordinary multiplica- 
tion formula (2) already given, and then transforming back to polykays by the 
above rule. 
APPLICATIONS TO UNBIASED RATIO ESTIMATION 


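The bias in the ratio estimator $\hat y=\bar X\bar r$ obtained from a random sample of $n$
observations $(x_i, y_i, r_i=y_i/x_i)$ from a population of size $N$ is

$$E\{\bar X\bar r\}-\bar Y=\langle(100)\rangle'\langle(001)\rangle'-\langle(010)\rangle'$$

when expressed in terms of symmetric population means

$$\langle(a_{11},a_{21},a_{31}),\ldots,(a_{1r},a_{2r},a_{3r})\rangle'=\frac{1}{N(N-1)\cdots(N-r+1)}\sum_{i_1\neq\cdots\neq i_r}^{N}(x_{i_1}^{a_{11}}y_{i_1}^{a_{21}}r_{i_1}^{a_{31}})\cdots(x_{i_r}^{a_{1r}}y_{i_r}^{a_{2r}}r_{i_r}^{a_{3r}}).$$

An angle bracket with a prime, $\langle\ \rangle'$, is used to denote a symmetric population
mean while $\langle\ \rangle$ without the prime denotes a sample symmetric mean. Applying
the multiplication formula (2), and noting that $x_ir_i=y_i$ so that $\langle(101)\rangle'=\langle(010)\rangle'$, we get

$$E\{\bar X\bar r\}-\bar Y=\frac1N\langle(101)\rangle'+\frac{N-1}{N}\langle(100),(001)\rangle'-\langle(010)\rangle'=\frac{N-1}{N}\{\langle(100),(001)\rangle'-\langle(010)\rangle'\}.$$

Hence, an unbiased estimator of the bias is simply

$$\frac{N-1}{N}\{\langle(100),(001)\rangle-\langle(010)\rangle\}=\frac{N-1}{N}\Big\{\frac{n\langle(100)\rangle\langle(001)\rangle-\langle(010)\rangle}{n-1}-\langle(010)\rangle\Big\}=\frac{n(N-1)}{N(n-1)}\{\bar x\bar r-\bar y\}$$

giving the adjusted unbiased estimator of $\bar Y$ as

$$\hat y'=\bar X\bar r-\frac{n(N-1)}{N(n-1)}(\bar x\bar r-\bar y)=\langle(100)\rangle'\langle(001)\rangle-\frac{N-1}{N}\{\langle(100),(001)\rangle-\langle(010)\rangle\}.$$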
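The unbiasedness of the adjusted estimator can be verified exactly on a small artificial population by averaging over every possible sample drawn without replacement. The sketch below is illustrative only; the population values and function names are assumptions, and it takes the adjusted estimator to be $\bar X\bar r-\frac{n(N-1)}{N(n-1)}(\bar x\bar r-\bar y)$.

```python
from fractions import Fraction
from itertools import combinations

def adjusted_ratio_estimate(sample, X_bar, N):
    """X_bar * r_bar minus the unbiased estimate of its bias."""
    n = len(sample)
    x_bar = Fraction(sum(x for x, _ in sample), n)
    y_bar = Fraction(sum(y for _, y in sample), n)
    r_bar = sum(Fraction(y, x) for x, y in sample) / n
    bias_hat = Fraction(n * (N - 1), N * (n - 1)) * (x_bar * r_bar - y_bar)
    return X_bar * r_bar - bias_hat

def plain_ratio_estimate(sample, X_bar):
    """Unadjusted estimator X_bar * r_bar."""
    n = len(sample)
    return X_bar * (sum(Fraction(y, x) for x, y in sample) / n)

# Hypothetical finite population of (x, y) pairs.
population = [(2, 3), (4, 5), (5, 9), (7, 8), (9, 15), (10, 12)]
N = len(population)
X_bar = Fraction(sum(x for x, _ in population), N)
Y_bar = Fraction(sum(y for _, y in population), N)

n = 3
samples = list(combinations(population, n))  # all equally likely samples
expected_adjusted = sum(adjusted_ratio_estimate(s, X_bar, N) for s in samples) / len(samples)
expected_plain = sum(plain_ratio_estimate(s, X_bar) for s in samples) / len(samples)
print(expected_adjusted, expected_plain, Y_bar)
```

Because all samples are enumerated with exact fractions, the average of the adjusted estimator can be compared with the population mean of y directly, while the unadjusted estimator shows its bias.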


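The variance of $\hat y'$ may then be computed as a straightforward application of
the multiplication formulas for symmetric means. Thus,

$$E(\hat y')^2=E\Big\{\langle(100)\rangle'\langle(100)\rangle'\langle(001)\rangle\langle(001)\rangle-2\Big(\frac{N-1}{N}\Big)\big[\langle(100)\rangle'\langle(001)\rangle\langle(100),(001)\rangle-\langle(100)\rangle'\langle(001)\rangle\langle(010)\rangle\big]$$

$$+\Big(\frac{N-1}{N}\Big)^2\big[\langle(100),(001)\rangle\langle(100),(001)\rangle-2\langle(100),(001)\rangle\langle(010)\rangle+\langle(010)\rangle\langle(010)\rangle\big]\Big\}$$

where

$$E\langle(100)\rangle'\langle(100)\rangle'\langle(001)\rangle\langle(001)\rangle=\Big\{\frac1N\langle(200)\rangle'+\frac{N-1}{N}\langle(100),(100)\rangle'\Big\}\Big\{\frac1n\langle(002)\rangle'+\frac{n-1}{n}\langle(001),(001)\rangle'\Big\}$$

$$=\frac{1}{nN}\Big\{\frac1N\langle(202)\rangle'+\frac{N-1}{N}\langle(200),(002)\rangle'+2\frac{n-1}{N}\langle(201),(001)\rangle'+\frac{(n-1)(N-2)}{N}\langle(200),(001),(001)\rangle'$$

$$+2\frac{N-1}{N}\langle(102),(100)\rangle'+\frac{(N-1)(N-2)}{N}\langle(100),(100),(002)\rangle'+2\frac{n-1}{N}\langle(101),(101)\rangle'$$

$$+4\frac{(n-1)(N-2)}{N}\langle(101),(100),(001)\rangle'+\frac{(n-1)(N-2)(N-3)}{N}\langle(100),(100),(001),(001)\rangle'\Big\}$$

$$=\frac{1}{nN^2}\{\langle(020)\rangle'+(N-1)\langle(200),(002)\rangle'+2(n-1)\langle(110),(001)\rangle'+(n-1)(N-2)\langle(200),(001),(001)\rangle'+2(N-1)\langle(011),(100)\rangle'$$

$$+(N-1)(N-2)\langle(100),(100),(002)\rangle'+2(n-1)\langle(010),(010)\rangle'+4(n-1)(N-2)\langle(010),(100),(001)\rangle'$$

$$+(n-1)(N-2)(N-3)\langle(100),(100),(001),(001)\rangle'\},\ \text{and}$$

$$2\Big(\frac{N-1}{N}\Big)\langle(100)\rangle'E\langle(001)\rangle\langle(100),(001)\rangle=2\Big(\frac{N-1}{N}\Big)\langle(100)\rangle'\frac1n\{\langle(101),(001)\rangle'+\langle(100),(002)\rangle'+(n-2)\langle(100),(001),(001)\rangle'\}$$

$$=\frac{2(N-1)}{nN^2}\{\langle(110),(001)\rangle'+\langle(010),(010)\rangle'+(N-2)\langle(010),(001),(100)\rangle'+\langle(200),(002)\rangle'+\langle(100),(011)\rangle'$$

$$+(N-2)\langle(100),(100),(002)\rangle'+(n-2)\langle(200),(001),(001)\rangle'+2(n-2)\langle(100),(001),(010)\rangle'$$

$$+(n-2)(N-3)\langle(100),(100),(001),(001)\rangle'\},\ \text{and}$$

$$2\Big(\frac{N-1}{N}\Big)\langle(100)\rangle'E\langle(001)\rangle\langle(010)\rangle=2\Big(\frac{N-1}{N}\Big)\langle(100)\rangle'\frac1n\{\langle(011)\rangle'+(n-1)\langle(001),(010)\rangle'\}$$

$$=\frac{2(N-1)}{nN^2}\{\langle(020)\rangle'+(N-1)\langle(100),(011)\rangle'+(n-1)\langle(010),(010)\rangle'+(n-1)\langle(001),(110)\rangle'+(n-1)(N-2)\langle(100),(001),(010)\rangle'\},$$

$$\Big(\frac{N-1}{N}\Big)^2E\langle(100),(001)\rangle\langle(100),(001)\rangle=\Big(\frac{N-1}{N}\Big)^2\frac{1}{n(n-1)}\{\langle(200),(002)\rangle'+\langle(010),(010)\rangle'+(n-2)\langle(200),(001),(001)\rangle'$$

$$+2(n-2)\langle(010),(001),(100)\rangle'+(n-2)\langle(100),(100),(002)\rangle'+(n-2)(n-3)\langle(100),(100),(001),(001)\rangle'\},$$

$$2\Big(\frac{N-1}{N}\Big)^2E\langle(100),(001)\rangle\langle(010)\rangle=2\Big(\frac{N-1}{N}\Big)^2\frac1n\{\langle(110),(001)\rangle'+\langle(100),(011)\rangle'+(n-2)\langle(100),(001),(010)\rangle'\},\ \text{and}$$

$$\Big(\frac{N-1}{N}\Big)^2E\langle(010)\rangle\langle(010)\rangle=\Big(\frac{N-1}{N}\Big)^2\frac1n\{\langle(020)\rangle'+(n-1)\langle(010),(010)\rangle'\}.$$

Combining these terms and subtracting the squared mean

$$\bar Y^2=\frac1N\{\langle(020)\rangle'+(N-1)\langle(010),(010)\rangle'\}$$

we get

$$\operatorname{Var}(\hat y')=\frac{N-n}{nN}\Big\{\langle(020)\rangle'-\langle(010),(010)\rangle'+\langle(200),(001),(001)\rangle'+2\langle(100),(010),(001)\rangle'-2\langle(110),(001)\rangle'$$

$$-\frac{N+1}{N}\langle(100),(100),(001),(001)\rangle'+\frac{N-n}{N(n-1)}\{\langle(010),(010)\rangle'-2\langle(100),(010),(001)\rangle'\}$$

$$+\frac{N-1}{N(n-1)}\{\langle(200),(002)\rangle'-\langle(100),(100),(002)\rangle'-\langle(200),(001),(001)\rangle'+2\langle(100),(100),(001),(001)\rangle'\}\Big\}.$$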

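An unbiased estimator of var $(\hat y')$ is therefore constructed simply by substituting
sample symmetric means for population symmetric means in this formula
for var $(\hat y')$.

If population size is infinite then $\langle(a_1),\ldots,(a_r)\rangle'=\langle(a_1)\rangle'\cdots\langle(a_r)\rangle'$ so
the limiting form of var $(\hat y')$ as $N$ approaches infinity is

$$\lim_{N\to\infty}\operatorname{var}(\hat y')=\frac1n\Big\{\langle(020)\rangle'-(\langle(010)\rangle')^2+\langle(200)\rangle'(\langle(001)\rangle')^2+2\langle(100)\rangle'\langle(010)\rangle'\langle(001)\rangle'-2\langle(110)\rangle'\langle(001)\rangle'$$

$$-(\langle(001)\rangle')^2(\langle(100)\rangle')^2+\frac{1}{n-1}\big\{(\langle(010)\rangle')^2-2\langle(100)\rangle'\langle(010)\rangle'\langle(001)\rangle'+\langle(200)\rangle'\langle(002)\rangle'$$

$$-(\langle(100)\rangle')^2\langle(002)\rangle'-\langle(200)\rangle'(\langle(001)\rangle')^2+2(\langle(100)\rangle')^2(\langle(001)\rangle')^2\big\}\Big\}$$

which is easily shown to reduce to the form given by Goodman and Hartley;
namely,

$$\lim_{N\to\infty}\operatorname{var}(\hat y')=\frac1n\{\sigma_y^2+\bar R^2\sigma_x^2-2\bar R\operatorname{Cov}(x,y)\}+\frac{1}{n(n-1)}\{\sigma_x^2\sigma_r^2+(\operatorname{Cov}(r,x))^2\}.$$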


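The bias of the estimator $y^*=\bar x\bar y/\bar X$ is

$$E\{y^*\}-\bar Y=\frac{1}{\bar X}\{E(\bar x\bar y)-\bar X\bar Y\}=\frac{1}{\bar X}\operatorname{Cov}(\bar x,\bar y)=\frac{1}{\bar X}\,\frac{N-n}{nN}\{\langle(11)\rangle'-\langle(10),(01)\rangle'\}$$

so an adjusted, unbiased estimator is

$$y^{*\prime}=\frac{1}{\bar X}\Big\{\bar x\bar y-\frac{N-n}{Nn(n-1)}\sum_{i=1}^{n}(x_i-\bar x)y_i\Big\}=\frac{1}{\bar X}\Big\{\frac{n(N-1)}{N(n-1)}\bar x\bar y-\frac{N-n}{N(n-1)}\langle(11)\rangle\Big\}$$

$$=\frac{1}{N\bar X}\{\langle(11)\rangle+(N-1)\langle(10),(01)\rangle\}.$$

The variance of $y^{*\prime}$ is then computed as follows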

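$$\operatorname{Var}(y^{*\prime})=E(y^{*\prime})^2-\bar Y^2$$

$$=\frac{1}{N^2\bar X^2}\{E\langle(11)\rangle\langle(11)\rangle+2(N-1)E\langle(11)\rangle\langle(10),(01)\rangle+(N-1)^2E\langle(10),(01)\rangle\langle(10),(01)\rangle\}-\frac{1}{\bar X^2}\langle(10)\rangle'\langle(10)\rangle'\langle(01)\rangle'\langle(01)\rangle'$$

$$=\frac{1}{N^2\bar X^2}\Big\{\frac1n\langle(22)\rangle'+\frac{n-1}{n}\langle(11),(11)\rangle'+2\frac{N-1}{n}\big[\langle(21),(01)\rangle'+\langle(10),(12)\rangle'+(n-2)\langle(11),(10),(01)\rangle'\big]$$

$$+\frac{(N-1)^2}{n(n-1)}\big[\langle(11),(11)\rangle'+\langle(20),(02)\rangle'+(n-2)\langle(20),(01),(01)\rangle'+2(n-2)\langle(11),(01),(10)\rangle'+(n-2)\langle(10),(10),(02)\rangle'$$

$$+(n-2)(n-3)\langle(10),(10),(01),(01)\rangle'\big]\Big\}$$

$$-\frac{1}{N^3\bar X^2}\big\{\langle(22)\rangle'+(N-1)\{2\langle(10),(12)\rangle'+2\langle(21),(01)\rangle'+\langle(20),(02)\rangle'+2\langle(11),(11)\rangle'\}$$

$$+(N-1)(N-2)\{\langle(20),(01),(01)\rangle'+4\langle(11),(10),(01)\rangle'+\langle(10),(10),(02)\rangle'\}+(N-1)(N-2)(N-3)\langle(10),(10),(01),(01)\rangle'\big\}$$

$$=\frac{N-n}{nN^3\bar X^2}\Big\{\langle(22)\rangle'+2(N-1)\{\langle(21),(01)\rangle'+\langle(10),(12)\rangle'\}+\frac{(N-2)(N+n)+2}{n-1}\langle(11),(11)\rangle'$$

$$+\frac{2(N-1)(Nn-2N-4n+4)}{n-1}\langle(11),(10),(01)\rangle'+\frac{(N-1)(N+n-1)}{n-1}\langle(20),(02)\rangle'$$

$$+\frac{(N-1)(Nn-2N-2n+2)}{n-1}\{\langle(10),(10),(02)\rangle'+\langle(20),(01),(01)\rangle'\}-\frac{2(N-1)(2Nn-3N-3n+3)}{n-1}\langle(10),(10),(01),(01)\rangle'\Big\}.$$

Expressed in terms of polykays this becomes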

N-n ,, NIN-1)+(-1 
+ [(11), (11) ]}’} + 2N{ [21), (01)]’ + [(10), (12)]’} 
+ N*{[(20), (01), (01) }’ + [(10), (10), (02) ]’} 





Var (y*’) = y { [(20), (02) ]’ 


+ 2(N? — 2) fanaa} 


A formula for an unbiased estimator of var (y*’) is obtained by substituting 
for each primed symbol the same symbol without the prime. As the population 
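size becomes large the variance of this estimator approaches the limit

$$\lim_{N\to\infty}\operatorname{var}(y^{*\prime})=\frac{\bar Y^2}{n}\Big\{\frac{\sigma_x^2}{\bar X^2}+\frac{\sigma_y^2}{\bar Y^2}+\frac{2\operatorname{Cov}(x,y)}{\bar X\bar Y}+\frac{1}{n-1}\,\frac{\sigma_x^2\sigma_y^2+(\operatorname{Cov}(x,y))^2}{\bar X^2\bar Y^2}\Big\}.$$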





A procedure for computing an unbiased estimator analogous to

$$\frac{\bar x_1\cdots\bar x_r}{\bar X_1\cdots\bar X_{r-1}}$$

is simply to write down the product $\bar X_1\cdots\bar X_r$ as a linear function of multipart
symmetric population means, replace each population symmetric mean with
the corresponding sample symmetric mean, and divide the result by $\bar X_1\cdots\bar X_{r-1}$.
Thus, letting $a_i$ denote the unit vector, $a_1=(1,0,\ldots,0)$, $a_2=(0,1,\ldots,0)$,
$\ldots$, $a_r=(0,0,\ldots,1)$, and $\alpha_r=\{(a_1),\ldots,(a_r)\}$, then by (3)


= ~ ke 
Xi +++ Fe = (ary) = D (Nye De ((ax))’. 


The estimator 





ge DN) de ((ax)) 


£, 
is then unbiased. For an infinite population this estimator reduces to 


(ar) ) 


if == 
jet, ERR, 


and in this case the variance may be written 
X,? - - - X,.?- var (2,") = var (((a,))) 
where E((a,))=((a,))’ =((ar)’) = Xi - + + X,. For r=3 it may be shown that 
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P X;? o;” o2? o3? 2o12 2013 2023 
var (8) = — {= set tS + SS 
X;? X:? X;? X 1X2 X Xo XeX3 


1 [= + 032? — ayo? + 013? ~— 7 2?3? + 723? 


Vox Yexe ) Kee 





te 





n—1 





03°23 + o12013  92?o13 + 612023 3012 + 13023 | 
2 — — 
, ( Yeu, | wxex, ’ we ) 
4 1 [ome + 01723? + 027013" + 037012? + “eae 
n(n — 1)(n — 2) ; 





X17X2°X;3? 


where $\sigma_i^2=\operatorname{var}(x_i)$, $\sigma_{ij}=\operatorname{Cov}(x_i,x_j)$.

In general, by the use of symmetric means or polykays, any estimator which 
can be expressed as a multivariate polynomial may be adjusted so as to be un- 
biased, and its variance formula computed in such a manner as to provide 
directly an unbiased variance estimator.


ESTIMATION OF PARAMETERS FROM INCOMPLETE
MULTIVARIATE SAMPLES


GEORGE E. NICHOLSON, JR.
University of North Carolina


The problem considered is that of using all of the observations in a 
sample of multivariate observations when for some of the observations 
values for certain of the characters are missing. This paper extends 
results obtained for estimation of the parameters in special cases to 
the general p-variate case and presents the solution to the problem of 
what use may be made of data from incomplete multivariate samples 
when the interest lies in the prediction of one character from a knowl- 
edge of the rest. It is concluded that for purposes of prediction no use 
can be made of incomplete samples. Under certain circumstances all 
the observations in an incomplete sample can be used to construct im- 
proved estimators. Application of the results to multivariate normal 
distributions truncated on one or more characters is mentioned. 


1. INTRODUCTION


IN investigations yielding samples from multivariate distributions, for example,
archaeological investigations, sample surveys, and psychological testing,
cases arise in which one or more of the observed characters for some of the
observations are missing.

Previous writers dealing with this problem include Wilks [7], Rao [6], 
Matthai [4] and Lord [3]. Wilks and Rao consider the bivariate case, Matthai 
the trivariate case, and Lord a special trivariate case. 

An example of a practical problem which gives rise to an incomplete sample 
is the following. Suppose we have a battery of p psychological tests which are 
designed to predict the performance of college freshmen in their first year. 
Suppose $N_1$ college freshmen take each of the p tests at the beginning of the
school year. At the end of the year some criterion score is available and this
results in a sample of $N_1$ observations from a p+1 variate distribution. From
this sample, we can estimate the p+1 means and the p(p+1)/2 different ele- 
ments of the covariance matrix which, if we make the assumption of multivari- 
ate normal distribution, yields estimates of all the unknown parameters. In 
this problem the prediction of the criterion of success is usually of major in- 
terest. The usefulness of the p tests in this connection may be assessed by the 
size of the multiple correlation coefficient between the criterion scores and the 
test scores used in the prediction battery. The estimation of this multiple corre- 
lation coefficient and the estimation of the coefficients of the regression equa- 
tion of the criterion on the tests or predictors is of central importance. 

When the next class of freshmen takes the battery of tests, $N_2$ additional
observations are available. Regarded as a sample of observations on p+1
variates, they constitute an incomplete sample since it is not possible to complete
the sample by obtaining the values of the criterion character until the
end of the school year. This second group of observations affords a basis for
estimating covariances between the test scores, and since these covariances are
involved in the expressions for the regression coefficients the question naturally
arises as to whether or not the $N_1+N_2$ observations taken together could be
used to get better estimates of the regression coefficients than the first sample 
alone. Also of interest in this problem is the possibility of improving the esti- 
mate of the mean of the criterion since the observations are correlated among 
themselves and the second set should provide some information about the 
criterion. 


2. MAXIMUM LIKELIHOOD ESTIMATES


Let $y$ be the criterion variable and let $x_1, x_2, \ldots, x_p$ be the p predictor
variables. If we assume these p+1 variables have a p+1 variate normal distribution,
then we have p+1 means $(\mu_y, \mu_1, \mu_2, \ldots, \mu_p)$ and p(p+1)/2 variances
and covariances $(\sigma_{yy}, \sigma_{y1}, \ldots, \sigma_{yp};\ \sigma_{11}, \ldots, \sigma_{ij}, \ldots, \sigma_{pp})$.

The maximum likelihood estimators of the (p+1)(p+2)/2 unknown parameters
using the first sample alone and both samples are given below. Statistics
calculated from the second sample of $N_2$ observations are distinguished
from those calculated from the first sample of $N_1$ observations by an asterisk.


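TABLE 1
COMPARISON OF MAXIMUM LIKELIHOOD ESTIMATES

Parameter                Using the First Sample Alone     Using Both Samples

$\mu_y$                  $\bar y$                         $\bar y+\{N_2/(N_1+N_2)\}\sum_i b_i(\bar x_i^*-\bar x_i)$
$\mu_i$                  $\bar x_i$                       $(N_1\bar x_i+N_2\bar x_i^*)/(N_1+N_2)$
$\Sigma_{11}$            $V_{11}/n$                       $[V_{11}-b(n^*V_{22}-nV_{22}^*)b'/(n+n^*)]/n$
$\Sigma_{12}$            $V_{22}b'/n$                     $(V_{22}+V_{22}^*)b'/(n+n^*)$
$\rho^2$                 $bV_{22}b'/V_{11}$               $\dfrac{n}{n+n^*}\,\dfrac{b(V_{22}+V_{22}^*)b'}{V_{11}-b(n^*V_{22}-nV_{22}^*)b'/(n+n^*)}$
$(1-\rho^2)\Sigma_{11}$  $(V_{11}-bV_{22}b')/n$           $(V_{11}-bV_{22}b')/n$
$\beta_i$                $b_i$                            $b_i$
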

Here the covariance matrix of the p+1 variate normal distribution is denoted by

$$\Sigma=\begin{pmatrix}\sigma_{yy}&\sigma_{y1}&\cdots&\sigma_{yp}\\ \sigma_{y1}&\sigma_{11}&\cdots&\sigma_{1p}\\ \vdots&\vdots&&\vdots\\ \sigma_{yp}&\sigma_{1p}&\cdots&\sigma_{pp}\end{pmatrix}=\begin{pmatrix}\Sigma_{11}&\Sigma_{12}\\ \Sigma_{12}'&\Sigma_{22}\end{pmatrix}$$

partitioned as above. The matrix of sums of squares and cross products computed
from the sample is $V$ and is partitioned similarly to the $\Sigma$ matrix. The
$b_i$ are the sample regression coefficients corresponding to the $x_i$ calculated from
the regression of $y$ on $x_1, \ldots, x_p$ in the first sample. The parameters estimated
by the $b_i$ are the $\beta_i$, the population regression coefficients, and $\rho^2$ is the squared multiple
correlation coefficient of $y$ on $x_1, \ldots, x_p$. The $n$ and $n^*$ are the degrees of freedom
in the respective samples and are equal to $N_1-1$ and $N_2-1$ respectively.


3. THE ESTIMATION EQUATIONS 


The estimation of the parameters is accomplished by writing down the likeli- 
hood function for the sample, maximizing this with respect to the unknown 
parameters and solving the resulting set of equations. The equations are com- 
plicated but may be handled conveniently in matrix notation. Details of the 
derivation of the estimation procedure are given elsewhere [5] and only an 
indication of the procedure will be given here. 

The likelihood of the incomplete sample of $N_1+N_2$ observations can be expressed
as the product of two quantities since it is known that sample means
are distributed independently of sample covariances under the assumptions
made here. Using this fact, the estimate for $\mu_y$ and the estimates for $\mu_i$ are
readily obtained. The variances of these estimates are $\sigma_{yy}[1/N_1-N_2\rho^2/\{N_1(N_1+N_2)\}]$
and $\sigma_{ii}/(N_1+N_2)$, respectively. From this, the advantage of using the entire
incomplete sample for estimating the mean of $y$ may be studied.

The estimators for the elements of the covariance matrix are obtained as 
above and turn out as displayed in Table 1. The maximum likelihood equations 
may be expressed as a matrix equation. 


i + * 2422227! D9’ — Dyed! V * 229122’ fl 5? V 
NZ 12! +n* 212’ — V* LZ’ N22+n*22— V* 


The solution of this leads to the estimates given. 


4. CONCLUSIONS 


The conclusion to be drawn from this investigation is that no use can be 
made of incomplete samples when the interest lies in prediction provided the 
prediction of a dependent variate is to be made on the basis of the regression 
equation and the assumption of an underlying multivariate distribution is 
made from the estimates given in Table 1. The estimators for $(1-\rho^2)\Sigma_{11}$, the
variance of the error of estimate, do not utilize any of the observations in the
incomplete part of the sample. This is also true of the estimators for the regression
coefficients $\beta_i$.

If the interest is in estimation of the mean value of a variate and if complete 
samples are difficult, expensive, or impossible to obtain in large enough sizes to 
ensure some required accuracy, additional information can be obtained by 
using all the sample information as indicated above. Further details about the
estimation procedure for the most general cases are given in [5].

Finally, these estimators can be used when samples are taken from multivariate
normal distributions truncated on one or more characters; see, for
example, [2].


TRUNCATION TO MEET REQUIREMENTS ON MEANS


FRANK EUGENE CLARK
Rutgers University


1. INTRODUCTION


SEVERAL papers have recently been written on estimating the parameters of
a population from examination of a random sample drawn from a truncated
portion of the population. If the sample is drawn from a normal, Poisson, or 
binomial population which has been truncated, estimates for the parameters of 
the original population are given in references [1], [2], and [6]. In this paper, 
we consider a converse problem: namely, if the parameters of a population are 
assumed to be known, how shall it be truncated to meet, with prescribed 
probabilities, certain sampling requirements? In particular, we consider one- 
tail and two-tail truncation of a normal population to meet specified require- 
ments on sample means. Subsequently, it is proposed to discuss the problems 
of meeting additional specified requirements on sample dispersion. 


2. MEAN AND VARIANCE OF TRUNCATED NORMAL POPULATION


The results of this section are included primarily for background purposes.
These results are employed later to obtain the particular contributions of the
paper.

Let

$$f(x)=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}\quad\text{and}\quad F(x)=\int_{-\infty}^{x}f(u)\,du;$$

i.e., let $f$ and $F$ be the density and distribution functions, respectively, for the
normal population with mean 0 and variance 1. We wish to find the mean and
variance of the truncated normal population which results from double truncation
at $x=a$ and $x=b$ for $-\infty\le a<b\le\infty$ (see Fig. 1). Let $\mu_{ab}$ and $\sigma_{ab}^2$ be the
mean and variance respectively of the truncated population.

Fig. 1. Normal population truncated on the left at a and on the right at b.

Then these parameters are given by (2.1) for the general case of double
truncation, by (2.2) for single truncation on the left ($b=\infty$), and by (2.3) for
the half-normal population ($a=0$, $b=\infty$):

$$(2.1)\qquad \mu_{ab}=-\frac{f(b)-f(a)}{F(b)-F(a)},\qquad \sigma_{ab}^2=1-\mu_{ab}^2-\frac{bf(b)-af(a)}{F(b)-F(a)}$$

$$(2.2)\qquad \mu_{a\infty}=\frac{f(a)}{1-F(a)},\qquad \sigma_{a\infty}^2=1-\mu_{a\infty}(\mu_{a\infty}-a)$$

$$(2.3)\qquad \mu_{0\infty}=2f(0)=\sqrt{2/\pi},\qquad \sigma_{0\infty}^2=1-\frac{2}{\pi}\quad(\sigma_{0\infty}\approx .6)$$

Note that the numerical value of the mean is the ratio obtained by dividing 
the difference in the ordinates of the normal curve at the points of truncation 
by the area between these points. No similar simple geometric interpretation 
for the variance is apparent. 

Values of $\mu_{ab}$ and $\sigma_{ab}$ are given in Table 1. A graph is given in Figure 2, showing
contour lines for $\mu_{ab}$ = constant and $\sigma_{ab}$ = constant. Special applications for
$b=\infty$ will be discussed first.
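The truncated mean and variance can also be evaluated numerically rather than read from a table or chart. The following is a minimal sketch using the expressions derived in the Appendix, namely mean $=-(f(b)-f(a))/(F(b)-F(a))$ and variance $=1-\text{mean}^2-(bf(b)-af(a))/(F(b)-F(a))$; the function name is an assumption, and infinite limits are handled by setting the vanishing boundary terms to zero.

```python
from math import inf, pi, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal: pdf plays the role of f, cdf of F

def truncated_moments(a=-inf, b=inf):
    """Mean and variance of a standard normal truncated to [a, b]."""
    fa = 0.0 if a == -inf else _STD.pdf(a)
    fb = 0.0 if b == inf else _STD.pdf(b)
    Fa = 0.0 if a == -inf else _STD.cdf(a)
    Fb = 1.0 if b == inf else _STD.cdf(b)
    afa = 0.0 if a == -inf else a * fa   # a*f(a) -> 0 as a -> -infinity
    bfb = 0.0 if b == inf else b * fb    # b*f(b) -> 0 as b -> +infinity
    area = Fb - Fa                       # proportion of the population retained
    mean = -(fb - fa) / area
    var = 1.0 - mean ** 2 - (bfb - afa) / area
    return mean, var

mean_half, var_half = truncated_moments(0.0, inf)   # half-normal case
mean_ab, var_ab = truncated_moments(-0.55, 1.84)    # doubly truncated case
print(mean_half, var_half, mean_ab, sqrt(var_ab))
```

The half-normal call reproduces the closed forms $\sqrt{2/\pi}$ and $1-2/\pi$, which gives a quick check on the general function.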


3. SINGLE TRUNCATION TO MEET REQUIREMENTS ON SAMPLE AVERAGE 


Assume that a manufacturer is producing items which are normally dis- 
tributed in a particular characteristic with known mean $\mu$ and known standard
deviation $\sigma$. These assumptions are somewhat unrealistic, since in general these
parameters are not known but are estimated by some sample previously selected.
Nevertheless, for simplicity, we make the assumptions. The acceptance
of a lot of items requires that the average of a random sample of size n drawn
from the lot being submitted shall have a mean $\bar X$ satisfying the inequality
$\bar X\ge$ LAL. Such a specification is imposed as part of a single sampling plan.*
This plan also includes specifications on the sample variability which we shall 
at present ignore. Screening of a lot to increase its probability of acceptance 
can be done effectively and cheaply and is recognized procedure by the cus- 
tomer (the government, in [7]). The specification on averages is made to insure 
that a large quantity of accepted items will not be found too close to specifica- 
tion limits on individuals. In other words the customer forces the manufacturer
to do his screening inside the limits which would be set on individuals.
If the specifications on averages are properly made, then specifications on indi-
viduals can be ignored since they will automatically be satisfied when the 
specifications of averages are met. We assume the lot size is large enough to be 
treated as the entire population in order that we may suppose the samples are 
drawn directly from the population. If the lot being submitted by the manu- 
facturer under the acceptance plan has a high risk of rejection, then he desires 
to reduce this risk to some prescribed value r by screening his production before 
submitting the lot. For this discussion, we assume the screening, or truncation, 
can be completely effective; in other words that we can exclude precisely those 





* Specifications of the Armed Forces on the purchase of “reliable electronic tubes” include such a requirement,
where LAL is lower average level [7]. See also [4] for a discussion of how specifications are made and [3] for a related
problem.

Fig. 2. Curves of constant $\mu_{ab}$ and $\sigma_{ab}$, with dotted lines giving the degree of truncation p.
(The block surrounding the point a = −.55, b = 1.80 is enlarged in Fig. 5.)


members of the normal population which lie outside arbitrarily chosen limits. 
We shall use capital letters to stand for measurements. The corresponding
small letters of Section 2 will then represent the departure of these measurements
from the population mean, in multiples of $\sigma$. Thus the manufacturer
desires to find a point of truncation $X=A$ so that the mean $\bar X_A$ of a random
sample drawn from the truncated population will meet the specification
$\bar X_A\ge$ LAL with risk r. The upper two portions of Fig. 3 illustrate this situation,
where UAL is to be disregarded at present.

Fig. 3. Example of single and of double truncation problem.

Let us first investigate the consequences of a choice of A. The value of r depends
uniquely upon $\mu$, $\sigma$, and $A$. To express measurements as multiples of $\sigma$,
let $a=(A-\mu)/\sigma$, and let $h=(\mu-\text{LAL})/\sigma$. Then, in multiples of $\sigma$, (2.2) gives
$\mu_{a\infty}+h$ as the distance from LAL to the mean of the truncated population and
$\sigma_{a\infty}$ as its standard deviation. The distribution of sample means drawn from the
truncated population will have standard deviation $\sigma_{a\infty}/\sqrt{n}$ and will be essentially
normal if $n>30$ [5; p. 201]. Let $u=-(\mu_{a\infty}+h)\sqrt{n}/\sigma_{a\infty}$. Then $r=F(u)$,
where $F$ is the standard normal distribution function, as given in Section 2.

Fig. 4 gives curves showing the relation between a and h for $u/\sqrt{n}=-.181$,
—.217, —.278, —.331, —.393. These values were chosen to give convenient 
values for r for selected values of n. The table included with Figure 4 gives the 
value of r along each of these curves. The scale on the right gives the degree of 
truncation p= F(a). 

A simple procedural routine for solving our truncation problem may now 
be outlined. We suppose n is fixed and a value for r (and hence for u) has been 
prescribed. This amounts to selecting one of the curves in Figure 4. For exam- 
ple, if n=35 and r=.05, curve 3 is chosen by using the table above. Suppose 
also, as in Fig. 3, LAL = 163, $\mu=161$, $\sigma=10$.

(i) Compute $h=(\mu-\text{LAL})/\sigma=(161-163)/10=-0.2$.
(ii) Find the corresponding value of a on the curve chosen. On curve 3,
if $h=-0.2$, then $a=-0.71$.
(iii) $A=\mu+a\sigma=161+(-0.71)(10)=153.9$.

Thus, screening his production at 153.9 gives our manufacturer a 5 per cent 
risk of lot rejection. The scale on the right of Figure 4 shows that the degree of 
truncation p is .24, i.e., this much of the production must be scrapped or re- 
worked. The lower specification limit on individuals (LSL) is shown as 149, and 
the screening point is well within this limit. 
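Where the chart of Figure 4 is not at hand, the routine can be carried out numerically. The following is a rough sketch, not the author's procedure: it replaces the chart lookup in step (ii) with a bisection search, uses the left-truncation moments $f(a)/(1-F(a))$ and $1-\mu(\mu-a)$ from Section 2, and relies on the same normal approximation for the sample mean. All function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def left_truncated_moments(a):
    """Mean and s.d. of a standard normal truncated on the left at a."""
    mean = STD.pdf(a) / (1.0 - STD.cdf(a))
    sd = sqrt(1.0 - mean * (mean - a))
    return mean, sd

def rejection_risk(a, h, n):
    """r = F(u) with u = -(mean + h) * sqrt(n) / sd (normal approximation)."""
    mean, sd = left_truncated_moments(a)
    return STD.cdf(-(mean + h) * sqrt(n) / sd)

def solve_truncation_point(h, n, r, lo=-3.0, hi=3.0, tol=1e-10):
    """Bisection for a; assumes r < 0.5 and a single crossing on [lo, hi]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if rejection_risk(mid, h, n) > r:
            lo = mid   # risk still too high: move the truncation point right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Values of the worked example in the text.
mu, sigma, LAL, n, r = 161.0, 10.0, 163.0, 35, 0.05
h = (mu - LAL) / sigma
a = solve_truncation_point(h, n, r)
A = mu + a * sigma
p = STD.cdf(a)   # degree of truncation
print(f"a = {a:.3f}, A = {A:.1f}, p = {p:.3f}")
```

The numerical solution should agree with the chart reading to about the accuracy with which the chart can be read.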

Fig. 4. Points for single truncation with prescribed risk.

Statistical decision theory can readily be applied to this problem if the manufacturer
can place a loss value on having a lot rejected and also on screening his
production for scrap or rework. He will then minimize the resulting loss function.
Evidently he cannot make his risk of lot rejection too small or he will
truncate too much of his production; and yet he must truncate enough so that 
this risk will not be too great. Between these extremes, the loss function will 
determine a truncation point at which he achieves his minimum loss (maximum 
profit). 

For example, let R be the cost of having a lot rejected, P the cost of reworking
or scrapping the lot, and L the expected loss. Then L = rR + pP with r and p
as above. If in the example above we assume R = P, the table below gives L as
r takes on the values given in Fig. 4.

p      .19      .21      .24      .27      .29
r      .142     .10      .05      .025     .01
L      .33R     .31R     .29R     .295R    .30R

In this case, the smallest L corresponds to the prescribed value r = .05.
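The comparison of expected losses is simple arithmetic and can be tabulated directly. A minimal sketch, assuming the (r, p) pairs read from Fig. 4 for the example and unit costs R = P = 1; the names are illustrative.

```python
# (r, p) pairs for the example: risk of rejection and degree of truncation.
p_values = [0.19, 0.21, 0.24, 0.27, 0.29]
r_values = [0.142, 0.10, 0.05, 0.025, 0.01]

def expected_loss(r, p, R=1.0, P=1.0):
    """L = r*R + p*P, the expected loss per lot."""
    return r * R + p * P

losses = [expected_loss(r, p) for r, p in zip(r_values, p_values)]
best_loss, best_r = min(zip(losses, r_values))
for r, p, loss in zip(r_values, p_values, losses):
    print(f"r = {r:<5}  p = {p:<4}  L = {loss:.3f} R")
print("minimum expected loss at r =", best_r)
```

With other cost ratios R/P the minimum may move to a different curve, which is the point of treating the choice of r as a decision problem.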


4. DOUBLE TRUNCATION TO MEET REQUIREMENTS ON SAMPLE MEANS 


Assume that the previously mentioned manufacturer is required to satisfy 
both an upper and a lower average limit, designated respectively by UAL and 
LAL. A random sample with mean $\bar X$ must satisfy the inequalities LAL $\le\bar X\le$ UAL.
To meet both requirements when the difference UAL − LAL is small,
it will be necessary to truncate at both tails simultaneously as indicated in the
bottom portion of Figure 3. Iterated single truncation at each tail will be inadequate
since a truncation at A which is satisfactory for the LAL will be unsatisfactory
when a subsequent truncation is made at B to satisfy the UAL.
Thus the points A and B must be determined simultaneously. The total risk
is $r=r_1+r_2$ (see the bottom portion of Fig. 3), where r depends on $\mu$, $\sigma$, $A$, and
$B$. Prescribing r for given $\mu$ and $\sigma$ does not determine A and B uniquely but
determines a curve in the (A, B) plane along which r has the prescribed value.
The manufacturer now chooses the point on this curve which minimizes the
total degree of truncation $p=p_1+p_2=1-(F(b)-F(a))$ (see Fig. 3).

Suppose A and B have been chosen. Let

$$(4.1)\qquad a=\frac{A-\mu}{\sigma},\qquad b=\frac{B-\mu}{\sigma},\quad\text{and}$$

$$(4.2)\qquad w=\frac{\text{UAL}-\text{LAL}}{\sigma},\qquad h=\frac{\mu-\tfrac12(\text{UAL}+\text{LAL})}{\sigma}.$$

Then, in multiples of $\sigma$, (2.1) gives $\mu_{ab}+h$ as the distance from ½(UAL+LAL)
to the mean of the truncated population and $\sigma_{ab}$ as its standard deviation. The
standard deviation for sample means from the truncated population is $\sigma_{ab}/\sqrt{n}$;
this distribution again will be essentially normal for the values of n we wish
to consider. Also let

$$(4.3)\qquad u=\frac{-\tfrac12w-(\mu_{ab}+h)}{\sigma_{ab}/\sqrt{n}},\qquad v=\frac{\tfrac12w-(\mu_{ab}+h)}{\sigma_{ab}/\sqrt{n}};$$

then

$$(4.4)\qquad r=1-(F(v)-F(u)).$$


Now, as in Section 3, suppose r is prescribed and A and B are to be deter- 
mined. Equation (4.4) defines a curve of constant r in the (u, v) plane. Equa- 
tions (4.3) give a transformation of this curve to the (a, b) plane. We next find 
the point (a, b) on this curve for which p=1—[F(b)—F(a)] is a minimum. 
From this by (4.1) find A and B. 

If $\sigma$ is small, relative to $w$, and $\mu$ is near LAL, the manufacturer will employ
single truncation as though no upper limit (UAL) were given. If $\sigma$ is large (of
the order of $2w$ or more), the minimum value of p for a prescribed value of r
will be obtained by truncating so that the mean of the truncated population is
essentially centered between LAL and UAL (and so that the variance is sufficiently
small). The precise location of the point giving the minimum p is slightly
off center but in practice this difference can be neglected (see Fig. 5 for an
illustration). Centering the mean in this way makes $v=-u$ and from (4.3) we
obtain $\mu_{ab}=-h$ and $\sigma_{ab}=\sqrt{n}\,w/2v$, while (4.4) becomes $r=2(1-F(v))$.

As in Section 3, a simple procedural routine can now be outlined for solving
the problem. For example, suppose $n=35$, $r=.05$, $\mu=161$, $\sigma=10$, LAL = 163,
UAL = 167.

(i) Compute w and h by (4.2). Here $w=(167-163)/10=0.4$,
$h=(161-165)/10=-0.4$.

(ii) Find v. Since $2(1-F(v))=.05$, $v=1.96$.

(iii) $\mu_{ab}=-h=0.4$; $\sigma_{ab}=\sqrt{n}\,w/2v=\sqrt{35}(0.4)/2(1.96)=.604$.
Note: This computation may, on occasion, give $\sigma_{ab}>1$. In this case,
centering the mean between LAL and UAL would reduce r below the
prescribed value and single truncation would be called for. For example,
if UAL = 170, then we would get $\sigma_{ab}=1.05$. The single truncation of
Section 3 with $r=.05$ at LAL would give only .0001 risk of rejection at
UAL. Therefore, we need not screen the upper values.

(iv) Find (a, b). From Figure 2, $a=-0.55$, $b=1.84$ (Fig. 5 shows this as
point $P_1$). Note: The intersection of the curves for $\mu_{ab}$ and $\sigma_{ab}$ may not
appear on Figure 2. In this case, as a practical matter, single truncation
will suffice. As an added precaution, the manufacturer might screen at
USL, the upper limit for individuals, while raising his lower limit
slightly, say $.05\sigma$.

(v) $A=\mu+a\sigma=161+(-0.55)(10)=155.5$,
$B=\mu+b\sigma=161+(1.84)(10)=179.4$.
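Steps (i) to (v) can be sketched numerically. The sketch below is an illustration under stated assumptions rather than the author's method: it uses the centered approximation (truncated mean set to $-h$, risk split equally between the two tails, so that v is the upper r/2 point of the normal distribution), and it replaces the graphical lookup on Figure 2 with a nested bisection on the double-truncation moments of Section 2. The function names and bracketing intervals are hypothetical choices.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution

def moments(a, b):
    """Mean and s.d. of a standard normal truncated to [a, b]."""
    area = STD.cdf(b) - STD.cdf(a)
    mean = -(STD.pdf(b) - STD.pdf(a)) / area
    var = 1.0 - mean ** 2 - (b * STD.pdf(b) - a * STD.pdf(a)) / area
    return mean, sqrt(max(var, 0.0))   # guard against rounding below zero

def bisect(func, lo, hi, iters=100):
    """Root of func on [lo, hi]; func(lo) and func(hi) must differ in sign."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def b_for_mean(a, target_mean, b_max=8.0):
    """Upper truncation point giving the target mean for a fixed a."""
    return bisect(lambda b: moments(a, b)[0] - target_mean, a + 0.01, b_max)

def solve_centered(target_mean, target_sd, a_lo, a_hi):
    """Find (a, b) whose truncated mean and s.d. match the targets."""
    def sd_gap(a):
        return moments(a, b_for_mean(a, target_mean))[1] - target_sd
    a = bisect(sd_gap, a_lo, a_hi)
    return a, b_for_mean(a, target_mean)

# Values of the worked example in the text.
n, r, mu, sigma, LAL, UAL = 35, 0.05, 161.0, 10.0, 163.0, 167.0
w = (UAL - LAL) / sigma
h = (mu - 0.5 * (UAL + LAL)) / sigma
v = STD.inv_cdf(1.0 - r / 2.0)            # two-sided risk r
target_sd = sqrt(n) * w / (2.0 * v)
a, b = solve_centered(-h, target_sd, a_lo=-0.70, a_hi=0.35)
A, B = mu + a * sigma, mu + b * sigma
p = 1.0 - (STD.cdf(b) - STD.cdf(a))       # degree of truncation
print(f"a = {a:.3f}, b = {b:.3f}, A = {A:.1f}, B = {B:.1f}, p = {p:.3f}")
```

The bracketing interval for a is chosen so that a solution for b exists at both ends for this example; other data would need the interval adjusted, and the search fails when the required standard deviation cannot be reached, which corresponds to the case where single truncation suffices.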


From the dotted lines on Fig. 2, we would estimate the degree of truncation p 
to be 32% (Fig. 5 shows p=32.4%). Note that the narrow width w of the 
specifications has enforced a greater degree of truncation than in the previous 
example in Section 3. Note also that the screening is within the specification 
limits on individuals.

Fig. 5. Precise location of points for double truncation (example).


5. CONCLUSION

We have shown how a normal population may be truncated to meet upper 
and lower requirements on sample means. The author has work in progress on 
the problem of meeting requirements on dispersion. Also the related problems 
for other populations (Poisson and binomial) are being investigated. 


6. APPENDIX 


Derivation of mean and variance for the truncated population 


As in Section 2, let f(x) =(2xr)-te 2” and F(x) =f2.. f(u)du. Let fas(z) 
= (F(b) —F(a))-'f(x) fora SxSb and f..(rx) =0 elsewhere; then f., is the density 
function for the doubly truncated normal population, truncated on the left 
tail at a and on the right tail at b. We now derive the mean yz, and the variance 
oa” for this population. 
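$$\mu_{ab} = \int_a^b x f_{ab}(x)\,dx = (F(b) - F(a))^{-1} \int_a^b x f(x)\,dx = (F(b) - F(a))^{-1} \bigl[-f(x)\bigr]_a^b = -(F(b) - F(a))^{-1}(f(b) - f(a)). \tag{6.1}$$

$$\sigma_{ab}^2 = \int_a^b x^2 f_{ab}(x)\,dx - \mu_{ab}^2 = (F(b) - F(a))^{-1} \int_a^b x^2 f(x)\,dx - \mu_{ab}^2$$

$$= (F(b) - F(a))^{-1} \Bigl( \bigl[-x f(x)\bigr]_a^b + \int_a^b f(x)\,dx \Bigr) - \mu_{ab}^2$$

$$= (F(b) - F(a))^{-1}(-b f(b) + a f(a)) + 1 - \mu_{ab}^2$$

$$= 1 - \mu_{ab}\bigl(\mu_{ab} - (f(b) - f(a))^{-1}(b f(b) - a f(a))\bigr). \tag{6.2}$$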


Parameters for the singly truncated population, truncated on the left tail at a, 
are obtained by letting b increase without bound. 


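$$\mu_{a\infty} = \lim_{b \to \infty} \mu_{ab} = (1 - F(a))^{-1} f(a). \tag{6.3}$$

$$\sigma_{a\infty}^2 = \lim_{b \to \infty} \sigma_{ab}^2 = (1 - F(a))^{-1}(a f(a)) + 1 - \mu_{a\infty}^2 = 1 + a\mu_{a\infty} - \mu_{a\infty}^2 = 1 - \mu_{a\infty}(\mu_{a\infty} - a). \tag{6.4}$$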


Thus, one-tail truncation (1) shifts the mean by the ratio of the ordinate 
at the truncation point divided by the area retained and (2) decreases the 
variance by the product of the mean of the truncated population multiplied 
by the difference between this mean and the truncation point. 
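
The formulas (6.1) to (6.4) are easy to evaluate numerically. The following is a minimal sketch, not part of the original paper, assuming the unit normal of the Appendix; the function names (`pdf`, `cdf`, `truncated_moments`) are illustrative choices. It is applied to the truncation points a = -0.55, b = 1.84 of the example in Section 4.

```python
import math

def pdf(x):
    """Unit normal density f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def cdf(x):
    """Unit normal distribution function F(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def _xf(x):
    """x * f(x), taken as 0 at an infinite truncation point."""
    return 0.0 if math.isinf(x) else x * pdf(x)

def truncated_moments(a, b=math.inf):
    """Mean, variance and retained area of the unit normal truncated to [a, b]."""
    area = cdf(b) - cdf(a)                              # fraction retained
    mean = -(pdf(b) - pdf(a)) / area                    # equation (6.1)
    var = (-_xf(b) + _xf(a)) / area + 1.0 - mean ** 2   # equation (6.2)
    return mean, var, area

mean, var, area = truncated_moments(-0.55, 1.84)
print(f"degree of truncation p = {1 - area:.3f}")
print(f"mean = {mean:.4f}, variance = {var:.4f}")
```

Calling `truncated_moments(a)` with the default `b` gives the one-tail case of (6.3) and (6.4); the degree of truncation printed for the example can be compared with the value read from Fig. 5.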
THE MIDRANGE OF A SAMPLE AS AN ESTIMATOR
OF THE POPULATION MIDRANGE 


Paul R. Rider
Wright Air Development Center 


A study is made of the distribution of the midranges of samples from 
five different symmetric populations of limited range, and of the relative 
efficiency of midrange and mean in estimating the population midrange, 
or mean, or median. (For a symmetric distribution these three averages 
are of course identical.) It is found that the midrange is more efficient 
than the mean for all of the populations considered, and that this 
efficiency increases with decreasing $\alpha_4$ ($= \mu_4/\sigma^4$), the standardized fourth
moment. The values of $\alpha_4$ for the five populations studied are 2.19,
2.14, 1.8, 1.19, and 1.


1. INTRODUCTION


If $x_1$ and $x_n$ are the least and the greatest members, respectively, of a random
sample of n from the population f(x), then the joint distribution of $x_1$ and
$x_n$ is [2, p. 191]

$$n(n-1)[F(x_n) - F(x_1)]^{n-2} f(x_1) f(x_n)\,dx_1\,dx_n, \tag{1}$$

where

$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$

Let us denote by u the midrange and by v the half-range of the sample; that is,
let

$$u = (x_1 + x_n)/2, \qquad v = (x_n - x_1)/2. \tag{2}$$


Making the transformation (2), we find that the joint distribution of midrange 
and half-range is 


$$2n(n-1)[F(u+v) - F(u-v)]^{n-2} f(u-v) f(u+v)\,du\,dv. \tag{3}$$


The distribution of u, or of v, can be found by integrating, between the ap- 
propriate limits, with respect to the other variable. 
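
The integration over v can also be done numerically for any symmetric population of limited range. The sketch below is an illustration under stated assumptions, not the author's procedure: it applies Simpson's rule to the joint density (3), and the names `midrange_density`, `half_width` and `steps` are hypothetical.

```python
def midrange_density(u, n, f, F, half_width, steps=2000):
    """Density of the midrange at u: integrate (3) over v from 0 to half_width - |u|.

    f and F are the population density and distribution function on
    (-half_width, half_width); steps must be even (Simpson's rule).
    """
    upper = half_width - abs(u)
    if upper <= 0.0:
        return 0.0

    def integrand(v):
        return (2 * n * (n - 1) * (F(u + v) - F(u - v)) ** (n - 2)
                * f(u - v) * f(u + v))

    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0

# Example: rectangular population on (-1, 1), samples of 4
rect_f = lambda x: 0.5
rect_F = lambda x: 0.5 * (x + 1.0)
print(midrange_density(0.3, 4, rect_f, rect_F, 1.0))
```

For the rectangular population the result can be compared with the closed form quoted later as (12).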


2. COSINE POPULATION 
The first population considered is 


$$f(x) = \tfrac{1}{2} \cos x, \qquad -\frac{\pi}{2} \le x \le \frac{\pi}{2}. \tag{4}$$


The cosine distribution, which has certain nice mathematical properties, seems 
to be one that has not been studied. 
Here (1) becomes

$$\tfrac{1}{4} n(n-1)\bigl[\tfrac{1}{2}(\sin x_n - \sin x_1)\bigr]^{n-2} \cos x_1 \cos x_n\,dx_1\,dx_n, \tag{5}$$

which by the use of (2) and of well known trigonometric formulas, can be reduced to

$$\tfrac{1}{2} n(n-1)(\cos u \sin v)^{n-2}(\cos^2 u - \sin^2 v)\,du\,dv. \tag{6}$$


To obtain the distribution of the midrange u we must integrate this expression
with respect to v between the limits 0 and $(\pi/2) - |u|$. The integration for
general n does not seem feasible. For the specific values n = 2, ..., 6 the
integration has been performed; the resulting distributions follow.


$$n=2:\quad \tfrac{1}{4}\bigl[(\pi - 2|u|)\cos 2u + \sin 2|u|\bigr],$$

$$n=3:\quad \tfrac{1}{4}\bigl[3\cos 3u + \cos u - \sin 4|u| + 2\sin 2|u|\bigr],$$

$$n=4:\quad \tfrac{3}{64}\bigl[4(\pi - 2|u|)(\cos 4u + \cos 2u) - \sin 6|u| + 2\sin 4|u| + 7\sin 2|u|\bigr],$$

$$n=5:\quad \tfrac{1}{96}\bigl(40\cos 5u + 72\cos 3u + 16\cos u - \sin 8|u| - 18\sin 6|u| + 2\sin 4|u| + 54\sin 2|u|\bigr),$$

$$n=6:\quad \tfrac{5}{2048}\bigl[12(\pi - 2|u|)(3\cos 6u + 8\cos 4u + 5\cos 2u) - \sin 10|u| - 16\sin 8|u| - 15\sin 6|u| + 80\sin 4|u| + 146\sin 2|u|\bigr]. \tag{7}$$


The variances can be found by multiplying the foregoing expressions by $u^2$
and integrating between the limits $-(\pi/2)$ and $(\pi/2)$. This operation has been
carried out and the results are listed in Table 1, in which, for purposes of com- 
parison, the variances of the mean are also shown. Although there is very little 
difference, the variance of the midrange is slightly less than that of the mean. 
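
As a check on one entry, the variance for n = 3 can be recomputed by numerical integration. The sketch below is illustrative only; it assumes the n = 3 midrange density $\tfrac14[3\cos 3u + \cos u - \sin 4|u| + 2\sin 2|u|]$ obtained by integrating (6), and the helper names are hypothetical.

```python
import math

def cosine_midrange_density_n3(u):
    """Midrange density for samples of 3 from the cosine population (4)."""
    t = abs(u)
    return 0.25 * (3 * math.cos(3 * u) + math.cos(u)
                   - math.sin(4 * t) + 2 * math.sin(2 * t))

def simpson(g, lo, hi, steps=2000):
    """Composite Simpson's rule with an even number of steps."""
    h = (hi - lo) / steps
    total = g(lo) + g(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0

half = math.pi / 2
# The density is even in u, so integrate over [0, pi/2] and double.
total_mass = 2 * simpson(cosine_midrange_density_n3, 0.0, half)
variance = 2 * simpson(lambda u: u * u * cosine_midrange_density_n3(u), 0.0, half)

closed_form = 5 * math.pi ** 2 / 32 - 25 / 18    # Table 1 entry for n = 3
variance_of_mean = (math.pi ** 2 - 8) / 12       # population variance / 3
print(total_mass, variance, closed_form, variance_of_mean)
```

The numerically integrated variance and the closed form should agree closely, and both should fall slightly below the variance of the mean.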


3. PARABOLIC POPULATION 


Perhaps the first symmetric distribution of limited range that comes to mind 
is the Pearson type II distribution, of which the simplest form is the parabolic 
distribution 


$$f(x) = \tfrac{3}{4}(1 - x^2), \qquad -1 \le x \le 1. \tag{8}$$


In this section we shall derive the distribution of the midrange of samples 
from (8). 
For this special case (3) assumes the form 


$$\tfrac{9}{8} n(n-1)\bigl[\tfrac{1}{2} v\{3(1-u^2) - v^2\}\bigr]^{n-2}\bigl[(1-u^2)^2 - 2(1+u^2)v^2 + v^4\bigr]\,du\,dv, \tag{9}$$
TABLE 1

VARIANCES OF MIDRANGE AND MEAN IN
SAMPLES FROM COSINE POPULATION

Sample size (n)    Variance of midrange                      Variance of mean

2                  $\pi^2/8 - 1 = 0.2337$                    $(\pi^2 - 8)/8 = 0.2337$
3                  $5\pi^2/32 - 25/18 = 0.1532$              $(\pi^2 - 8)/12 = 0.1558$
4                  $13\pi^2/128 - 8/9 = 0.1135$              $(\pi^2 - 8)/16 = 0.1169$
5                  $63\pi^2/512 - 253/225 = 0.08998$         $(\pi^2 - 8)/20 = 0.09348$
6                  $181\pi^2/2048 - 359/450 = 0.07449$       $(\pi^2 - 8)/24 = 0.07790$

which must be integrated with respect to v between the limits 0 and $1 - |u|$.
The integration for general n does not seem to lead to a simple expression,
although it is obvious that this will be a polynomial of degree 3n−1. For the
specific values of n from 2 through 5 the integration has been carried out and
the resulting distributions are as follows.


$$n=2:\quad \tfrac{3}{20}\bigl[15(1-u^2)^2(1-|u|) - 10(1+u^2)(1-|u|)^3 + 3(1-|u|)^5\bigr],$$

$$n=3:\quad \tfrac{9}{64}\bigl[36(1-u^2)^3(1-|u|)^2 - 6(7 - 2u^2 - 5u^4)(1-|u|)^4 + 4(5 - u^2)(1-|u|)^6 - 3(1-|u|)^8\bigr],$$

$$n=4:\quad \tfrac{3}{3080}\bigl[10395(1-u^2)^4(1-|u|)^3 - 8316(2 - 3u^2 + u^6)(1-|u|)^5 + 990(11 - 10u^2 - u^4)(1-|u|)^7 - 1540(2 - u^2)(1-|u|)^9 + 315(1-|u|)^{11}\bigr],$$

$$n=5:\quad \tfrac{3}{448}\bigl[2835(1-u^2)^5(1-|u|)^4 - 1890(3 - 8u^2 + 6u^4 - u^8)(1-|u|)^6 + 945(5 - 9u^2 + 3u^4 + u^6)(1-|u|)^8 - 84(23 - 28u^2 + 5u^4)(1-|u|)^{10} + 35(11 - 7u^2)(1-|u|)^{12} - 30(1-|u|)^{14}\bigr]. \tag{10}$$


As before, the variances can be found by multiplying these expressions by $u^2$
and integrating between —1 and 1. These variances are compared with the 
variances of the mean in Table 2. 
TABLE 2

VARIANCES OF MIDRANGE AND MEAN IN SAMPLES
FROM PARABOLIC POPULATION

Sample size (n)    Variance of midrange          Variance of mean

2                  1/10 = 0.1                    1/10 = 0.1
3                  5/77 = 0.06494                1/15 = 0.06667
4                  641/13475 = 0.04757           1/20 = 0.05
5                  3176/85085 = 0.03733          1/25 = 0.04

4. RECTANGULAR POPULATION 
Perhaps the earliest consideration of the midrange was by Neyman and 
Pearson [1], who showed that for samples of n from the rectangular population 
$$f(x) = \tfrac{1}{2}, \qquad -1 \le x \le 1, \tag{11}$$
the distribution of the midrange is
$$f(u) = \tfrac{1}{2} n (1 - |u|)^{n-1}. \tag{12}$$


From (12) it is easy to show that the variance of the midrange is $2/[(n+1)(n+2)]$
while the variance of the mean is $1/(3n)$. Thus, since the variance of the midrange
diminishes essentially as $n^{-2}$, while the variance of the mean diminishes only as
$n^{-1}$, the midrange rapidly becomes (as the sample size increases) much more
efficient than the mean in estimating the population mean. In fact, it is well 
known that the sample midrange is the best possible estimate of the center of a 
rectangular distribution. 
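
The two variance formulas just quoted give the efficiency of the mean relative to the midrange directly. A small sketch in exact arithmetic follows; the function name is an illustrative choice, and the range n = 2 to 6 is taken to match the sample sizes used elsewhere in the paper.

```python
from fractions import Fraction

def rectangular_variances(n):
    """Variances of midrange and mean for samples of n from the rectangular population (11)."""
    var_midrange = Fraction(2, (n + 1) * (n + 2))
    var_mean = Fraction(1, 3 * n)
    return var_midrange, var_mean

efficiencies = {}
for n in range(2, 7):
    var_midrange, var_mean = rectangular_variances(n)
    # Efficiency of the mean relative to the midrange, in per cent
    efficiencies[n] = 100 * float(var_midrange / var_mean)
    print(n, var_midrange, var_mean, f"{efficiencies[n]:.0f}")
```

The ratio simplifies to 6n/[(n+1)(n+2)], which is 1 for n = 2 and falls steadily as n grows.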


5. INVERTED PARABOLIC POPULATION 
The next population which we wish to consider is a U-shaped distribution 


$$f(x) = \tfrac{3}{2} x^2, \qquad -1 \le x \le 1, \tag{13}$$


which might be termed the inverted parabolic distribution. 
Here (3) becomes 


$$\tfrac{9}{2} n(n-1)\bigl[v(3u^2 + v^2)\bigr]^{n-2}(u^2 - v^2)^2\,du\,dv, \tag{14}$$


and proceeding as before we find the following distributions of u. 





$$n=2:\quad \tfrac{3}{5}\bigl[15u^4(1-|u|) - 10u^2(1-|u|)^3 + 3(1-|u|)^5\bigr],$$

$$n=3:\quad \tfrac{9}{8}\bigl[36u^6(1-|u|)^2 - 30u^4(1-|u|)^4 + 4u^2(1-|u|)^6 + 3(1-|u|)^8\bigr],$$

$$n=4:\quad \tfrac{6}{385}\bigl[10395u^8(1-|u|)^3 - 8316u^6(1-|u|)^5 - 990u^4(1-|u|)^7 + 1540u^2(1-|u|)^9 + 315(1-|u|)^{11}\bigr],$$

$$n=5:\quad \tfrac{15}{14}\bigl[567u^{10}(1-|u|)^4 - 378u^8(1-|u|)^6 - 189u^6(1-|u|)^8 + 84u^4(1-|u|)^{10} + 49u^2(1-|u|)^{12} + 6(1-|u|)^{14}\bigr]. \tag{15}$$


Variances are compared with the variances of the mean in Table 3. 


6. DICHOTOMOUS POPULATION 


The distribution for which $\alpha_4$ has its minimum value 1 is the discrete distribution

$$f(x) = \begin{cases} \tfrac{1}{2} & \text{for } x = \pm 1, \\ 0 & \text{elsewhere.} \end{cases} \tag{16}$$

The distribution of the midrange u of samples of size n from (16) is

$$f(u) = \begin{cases} 2^{-n} & \text{for } u = \pm 1, \\ 1 - 2^{-n+1} & \text{for } u = 0. \end{cases} \tag{17}$$

The variance of (17) is readily computed to be $2^{-n+1}$; the variance of means of
samples of n is $n^{-1}$. Comparisons can be made so easily that no table is
exhibited. Obviously the advantage of the midrange increases rapidly with sample
size.
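
Because the dichotomous population has only two values, the midrange variance can be confirmed by listing every possible sample. The sketch below is an illustration, with a hypothetical function name, and enumerates all 2^n equally likely samples.

```python
from fractions import Fraction
from itertools import product

def dichotomous_midrange_variance(n):
    """Exact variance of the midrange for samples of n from the two-point population (16)."""
    weight = Fraction(1, 2 ** n)          # each sample of +1/-1 values is equally likely
    second_moment = Fraction(0)
    for sample in product((-1, 1), repeat=n):
        u = Fraction(min(sample) + max(sample), 2)
        second_moment += weight * u * u   # the mean of u is 0 by symmetry
    return second_moment

for n in range(2, 7):
    var_midrange = dichotomous_midrange_variance(n)
    var_mean = Fraction(1, n)             # population variance is 1
    print(n, var_midrange, var_mean, var_midrange / var_mean)
```

The last column printed is the efficiency of the mean relative to the midrange, as a fraction.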


7. CONCLUSION 


For all of the populations studied the midrange is more efficient than the
mean. This efficiency increases as $\alpha_4$ decreases, as is seen in Table 4, in which
the efficiency of the mean is given relative to the midrange in order to keep the
efficiency values from exceeding 100 per cent.

The relative efficiency of the midrange for samples from a given population
increases with sample size, although not very rapidly for $\alpha_4 > 2$.

It would seem that if it is known that samples are being drawn from a symmetric
population for which $\alpha_4 < 2$, it would be desirable to use the midrange as
an estimate of the population mean. It might be remarked here that the midrange
is slightly easier to compute than the mean.
TABLE 3

VARIANCES OF MIDRANGE AND MEAN IN SAMPLES
FROM INVERTED PARABOLIC POPULATION

Sample size (n)    Variance of midrange          Variance of mean

2                  3/10 = 0.3                    3/10 = 0.3
3                  117/770 = 0.1519              1/5 = 0.2
4                  213/2695 = 0.07904            3/20 = 0.15
5                  723/17017 = 0.04249           3/25 = 0.12

TABLE 4
EFFICIENCY OF MEAN RELATIVE TO MIDRANGE

                                      Population
Sample size    Parabolic          Rectangular       Inverted parabolic    Dichotomous
(n)            $\alpha_4 = 2.14$  $\alpha_4 = 1.8$  $\alpha_4 = 1.19$     $\alpha_4 = 1$

2              100                100               100                   100
3              97.4               90                76.0                  75
4              95.1               80                52.7                  50
5              93.2               71                35.4
6                                 64
NOTE ON GROUPING


D. R. Cox 
Birkbeck College, University of London* 


Suppose that it is required to condense observations of a variate 
into a small number of groups, the grouping intervals to be chosen to 
retain as much information as possible. One way of formulating this 
requirement mathematically is given and numerical recommendations 
are made for use when the variate is normally distributed. 


1. INTRODUCTION 


Let X denote a continuous variate and suppose that for convenience of exposition
it is desired to classify the individuals in the population into k
groups. How should this be done? For some purposes it is desirable, for ease of 
calculation or graphical representation, to divide the observed range of varia- 
tion into groups of equal width; in other cases, such as those we shall be con- 
cerned with, this has no particular advantages. Thus X may be some property 
connected with the health of an individual and, with k=3, we may wish to 
classify individuals as “poor,” “average,” or “good.” The question also arises 
as to how much additional information can be conveyed by having say four 
groups instead of three. 

We assume that the grouping has to be made on the basis of a large sample of 
values of X. Often there would be in addition external information about the 
practical interpretation of different values of X; thus if X is blood pressure we 
know roughly that certain values of X represent dangerously low or dangerously 
high values. It will be assumed here that such external information is absent. 


2. CRITERION FOR GROUPING 


When the grouping is done for convenience of exposition, any mathematical 
condition set up to define the “best” system of grouping is bound to be some- 
what artificial, but the following seems a reasonable approach. With the ith 
group associate a value $\xi_i$ and imagine that each individual put in the ith
group is given the value $\xi_i$. That is, we have a variate $\xi(X)$, a function of the
variate X, defined by $\xi(x) = \xi_i$ when x is in the range corresponding to the ith
group. We can now measure the loss due to grouping an individual with value x
into the ith group by $f(x - \xi_i, x)$ where f is a suitable function; we shall take

$$f(x - \xi_i, x) = (x - \xi_i)^2/\sigma^2, \tag{1}$$

where $\sigma$ is the standard deviation of X. The fact that the function in (1) is
independent of the second argument amounts to the statement that there is no
external information of the sort discussed in Section 1, differences of a certain
magnitude $(x - \xi_i)$ being of equal importance in any part of the scale. If the
object of the grouping is not ease of exposition, but is the calculation of some 
statistical test or estimate, the efficiency of the statistical procedure, and not 
(1) should, of course, be used in investigating grouping. 

* This work was done in the Department of Biostatistics, School of Public Health, University of North Carolina. 
The average loss from grouping is obtained by averaging (1), i.e. by

$$L_k = E[X - \xi(X)]^2/\sigma^2. \tag{2}$$

To minimize $L_k$ for fixed k, proceed in two stages. First for any fixed system of
grouping, choose $\xi_1, \ldots, \xi_k$ to minimize $L_k$; then choose the system of grouping.
The first step is easy. Choose $\xi_i$ to be the mean of all observations in the ith
group; this achieves the first part of the minimization, since the second moment
about the mean is less than that about any other point. From now on
$\xi_i$, $\xi(X)$ will be used only to denote these mean values; also we write $\xi = E(X)$.
If $p_i$ denotes the probability of an observation falling in the ith group,

$$L_k = \sum_{i=1}^{k} p_i E[(X - \xi_i)^2 \mid X \text{ in the ith group}]/\sigma^2 = 1 - \frac{\sum_{i=1}^{k} p_i(\xi_i - \xi)^2}{\sigma^2} = 1 - M, \text{ say.} \tag{3}$$

Equation (3) follows from squaring the identity $X - \xi = (X - \xi_i) + (\xi_i - \xi)$ and
averaging, i.e. from the analysis of variance identity. We wish to maximize M.
Note that if k=1, $L_1 = 1$, representing complete loss of information about
differences among individuals.


3. DETAILED CALCULATIONS 


Now assume that X is normally distributed. Without loss of generality we 
may base our calculations on the unit normal distribution, writing 


$$g(x) = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}, \tag{4}$$

$$G(x) = \int_{-\infty}^{x} g(u)\,du. \tag{5}$$

The conditional mean of X given $x_1 < X < x_2$ is $[g(x_1) - g(x_2)]/[G(x_2) - G(x_1)]$.

Consider the problem of maximizing M for successively increasing values of 
k. For k=1, 2 the problem is trivial. Thus with k=2, we have only to choose 
one division point and by symmetry this must be taken as the mean, 0. (It is 
assumed without proof that it is a bad thing to form a group from several non- 
contiguous intervals.) For k=3, we have to choose y>0 such that M is maximized
if we take our groups to be $(-\infty, -y)$, $(-y, y)$, $(y, \infty)$; again symmetry
conditions have been used to simplify the problem. We have

$$M = 2[g(y)]^2/G(-y). \tag{6}$$

Calculation shows that this has a maximum value of 0.8098 attained at
y=0.612. Thus for a general normal distribution the three groups should be
$(-\infty, \xi - 0.612\sigma)$, $(\xi - 0.612\sigma, \xi + 0.612\sigma)$ and $(\xi + 0.612\sigma, \infty)$, the percentages
of individuals in the three groups being 27.0, 45.9 and 27.0. Similarly for k=4,
the groups may, for the unit normal, be taken as $(-\infty, -y)$, $(-y, 0)$, $(0, y)$,
$(y, \infty)$, where y is to be determined. And for k=5, $(-\infty, -y_1)$, $(-y_1, -y_2)$,
$(-y_2, y_2)$, $(y_2, y_1)$, $(y_1, \infty)$, and so on. The maximization problem gets rapidly
more complicated as the number of adjustable points increases. The boundaries
of the grouping intervals for $k \le 6$, which is probably the practical range, are as
follows:

k = 2: 0;
k = 3: ±0.612;
k = 4: ±0.980, 0;
k = 5: ±1.230, ±0.395;
k = 6: ±1.449, ±0.660, 0.

Table I gives the appropriate distribution of the observations among the k
groups in these cases and the maximum values of M.
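
The quantity M can be evaluated for any set of boundaries from the group probabilities and conditional means. The following sketch is not from the paper: it assumes the unit normal, uses hypothetical function names, and finds the k = 3 boundary by a golden-section search (a technique chosen here for simplicity; the paper does not say how its maxima were computed).

```python
import math

def g(x):
    """Unit normal density, with g(+-infinity) = 0."""
    return 0.0 if math.isinf(x) else math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def G(x):
    """Unit normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def information_retained(boundaries):
    """M = sum of p_i * xi_i**2 over the groups cut at the given interior boundaries."""
    cuts = [-math.inf] + sorted(boundaries) + [math.inf]
    total = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        p = G(hi) - G(lo)            # probability of the group
        xi = (g(lo) - g(hi)) / p     # conditional mean of the group
        total += p * xi * xi
    return total

def best_three_group_cut(lo=0.1, hi=1.5, tol=1e-7):
    """Golden-section search for the y that maximizes M with cuts at -y and +y."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if information_retained([-c, c]) < information_retained([-d, d]):
            a = c                    # maximum lies to the right of c
        else:
            b = d                    # maximum lies to the left of d
    return 0.5 * (a + b)

y = best_three_group_cut()
print("k = 3:", round(y, 3), round(information_retained([-y, y]), 4))
print("k = 2:", round(information_retained([0.0]), 4))
print("k = 4:", round(information_retained([-0.980, 0.0, 0.980]), 4))
```

`information_retained` accepts any list of boundaries, so the k = 5 and k = 6 boundaries listed above can be checked in the same way.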


TABLE I

PERCENTAGE DISTRIBUTION OF OBSERVATIONS IN OPTIMUM
GROUPING AND PERCENTAGE AMOUNT OF
INFORMATION RETAINED, 100 $M_{\max}$

Number of                                                              M for grouping
groups, k    Percentage distribution                     $M_{\max}$    with equal
                                                                       frequencies

2            50.0%; 50.0%                                63.66%        63.66%
3            27.0%; 45.9%; 27.0%                         80.98%        79.32%
4            16.4%; 33.6%; 33.6%; 16.4%                  88.25%        86.13%
5            10.9%; 23.7%; 30.7%; 23.7%; 10.9%           92.01%        89.69%
6            7.4%; 18.1%; 24.5%; 24.5%; 18.1%; 7.4%      94.20%        91.94%





The function M is very flat in the neighborhood of its maximum, so that the 
precise choice of the grouping intervals is not critical. This is illustrated by the 
last column of Table I which gives the value of M obtained by dividing into
groups with equal frequencies of occurrence. Thus, with k=3, if we divide into
three groups each with one third of the observations, M = 79.32%, very near to
$M_{\max}$.


4. RELATION TO A PROBLEM ON ORDER STATISTICS


The mathematical problem leading to the results of Section 3 is for a normal 
population identical with that considered in a somewhat different context by 
Karl Pearson [3], Mosteller [1] and Ogawa [2]. In this, instead of requiring 
to maximize the information retained concerning individual values of the 
variate, we want to estimate the population mean by a linear combination of 
(k—1) order statistics chosen from a large sample, the particular statistics for 
use being at our choice. 

Thus, with k=3, we take a suitably weighted combination of k—1=2 order 
statistics, chosen to minimize the variance of the resulting unbiased estimate, 
and this leads to an efficiency of $M_{\max}$ and to the use of the 0.270n th and the
0.730n th order statistics, where n is the sample size. These correspond to the
boundaries of the optimum grouping intervals of Section 3. This correspondence 
holds for general values of k. 

In Section 3 the particular expressions for M come from formulas for the 
centroids of sections of the normal distribution. In the problem with order 
statistics, the expression for efficiency involves the asymptotic variance-co- 
variance matrix of the order statistics. It seems to be a peculiar property of the 
normal distribution that the resulting expressions should be equal. That they 
are not equal in general can be seen by considering a rectangular population 
with k=3. Here the optimum estimation procedure, but not the optimum 
grouping, is obtained from the extreme observations. 

The values given in Table I correspond, except for discrepancies in the final 
decimal places, with the table of values in [2]. 


5. APPLICATION AND DISCUSSION 


The results of Section 3 require the mean and standard deviation of the 
normal population to be known. When we are given instead a large random 
sample, we estimate the mean and standard deviation in the usual way and sub- 
stitute these estimates in the population formulas. Discussion of the effect of 
errors of estimation does not seem very important in the present context and 
will not be attempted. 

The general effect of population non-normality can be stated qualitatively. 
In Table I the groups in the tails have a lower frequency than the central 
groups. If we deal with platykurtic distributions, with less important tails, it 
will be best to allow the frequency in the tail groups to rise; with the rec- 
tangular population a grouping into equal frequencies is optimum. With a 
leptokurtic population, or with a skew population with one long tail, it will be 
best to have a tail group or groups with lower frequencies than those given
in Table I. 

If practical considerations indicate that considerable importance attaches to 
small differences in certain ranges of X, this should, as noted in Sections 1 and 2, 
be taken into account. 


6. TWO DIMENSIONAL PROBLEM 


The problem sometimes arises in the following form. There are two variates 
X, Y and it is required to group individuals on the basis of Y. We are given a 
large sample of pairs (X, Y) and for future individuals will be given only X. 
It is required to form a system of arranging the X values into k groups so as 
to predict as precisely as possible the individual variations of Y. 

Let X, Y follow a bivariate normal frequency surface with correlation coefficient
$\rho$ and variances $\sigma_x^2$, $\sigma_y^2$, so that the variance of Y conditional on X is
$\sigma_y^2(1-\rho^2)$. The problem of grouping the X values to minimize the loss of
information about Y is essentially the one-dimensional problem considered earlier.
In fact, if we assign an individual to the ith group of X values, for which
the mean of (X, Y) is $(\xi_i, \eta_i)$, we are in effect replacing the unknown value Y by
$\eta(x)$ equal, in this case, to $\eta_i$. The loss of information is $L_k' = E[Y - \eta(x)]^2/\sigma_y^2
= (1-\rho^2) + \rho^2 L_k$, as follows on squaring the identity $Y - \eta_i = [Y - E(Y \mid X)]
+ [E(Y \mid X) - \eta_i]$ and averaging, using the properties of the bivariate normal
surface. This is minimized when $L_k$ is. Thus if $\rho^2 = \tfrac{1}{2}$ and k=4, the best system
of grouping X gives $L_k' = \tfrac{1}{2} + \tfrac{1}{2}(1 - 0.8825) = 0.5588$. Of the 56% of information
lost about Y, 50% is accounted for by the dispersion of Y about its regression
on X and the remaining 6% by the use of grouped values of X instead of the
continuous observations.
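
The arithmetic of this example can be laid out as a short calculation. The sketch below is illustrative; the function name and argument names are hypothetical, and the value 0.8825 is the k = 4 entry for the information retained in Table I.

```python
def loss_about_y(rho_squared, m_x):
    """Loss of information about Y when X is grouped.

    rho_squared: squared correlation between X and Y.
    m_x: proportion of information about X retained by the grouping.
    """
    loss_x = 1.0 - m_x
    return (1.0 - rho_squared) + rho_squared * loss_x

total = loss_about_y(0.5, 0.8825)
from_regression = 1.0 - 0.5              # dispersion of Y about its regression on X
from_grouping = total - from_regression  # extra loss from using grouped X
print(round(total, 4), round(from_regression, 4), round(from_grouping, 4))
```

The split into two parts mirrors the 50% and 6% quoted in the text.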

I am grateful to B. G. Greenberg for suggesting this investigation and for 
very helpful comments. 
USE OF DUMMY VARIABLES IN
REGRESSION EQUATIONS 


Daniel B. Suits
University of Michigan 


The use of dummy variables requires the imposition of additional 
constraints on the parameters of regression equations if determinate 
estimates are to be obtained. Among the possible constraints the most 
useful are (a) to set the constant term of the equation to zero, or 
(b) to omit one of the dummy variables from the equation. In working 
with a single system of classes either constraint can be used, and results 
from the application of one are readily derived from those obtained 
from the other. If several systems of classes are involved the best pro- 
cedure is to delete one dummy variable from each system. 


The dummy variable is a simple and useful method of introducing into a
regression analysis information contained in variables that are not
conventionally measured on a numerical scale, e.g., race, sex, region, occupation,
etc. The technique itself is not new but, so far as I am aware, there has never 
been any exposition of the procedure. As a consequence students and research- 
ers trying to use dummy variables are sometimes frustrated in their first at- 
tempts. It is the purpose of this note to point out very briefly some of the 
problems encountered in the use of dummy variables and some of the alterna- 
tive procedures available. A few concluding remarks will be directed to the 
more general application of the dummy variable, to include its use in the analy- 
sis of the influence of variables already conventionally scaled. 


1. A SINGLE SYSTEM OF CLASSES


Let us suppose we are concerned with the regression of a numerically scaled 
dependent variable Y, on a set of numerical independent variables $X_1$, $X_2$, etc.;
furthermore the population is partitioned into mutually exclusive classes, and
we know to which class each item of the sample belongs. We want to study not
only the influence of $X_1$, $X_2$, etc. on Y but also the effect of class membership.
To fix ideas and simplify the presentation let us suppose the dependent variable 
to be the number of pounds of sweet potatoes consumed by a family. We wish 
to study the relationship between this and family income, X, and also to de- 
termine the influence of the region in which the family lives. For this purpose 
suppose we have a three region classification of families: Eastern, Southern, 
and Western. 

Since region is not a conventionally scaled attribute we must somehow sup- 
ply it with numerical values if we are to introduce it into a regression equation. 
To do this we define three dummy variables, $R_1$, $R_2$, $R_3$, with the property that
$R_i = 1$ if the item belongs to the ith region; otherwise $R_i = 0$. These variables
may then be put in the regression as variables in good standing provided the 
proper steps are taken to insure that the solution of the normal equations will 
be determinate. 
The natural inclination is to set up a regression model of the form

$$Y = aX + b_{11}R_1 + b_{12}R_2 + b_{13}R_3 + c_1 + u \tag{1.1}$$

but it is immediately clear that this would be a mistake. The optimum estimates
of $c_1$ and the $b_{1i}$ are indeterminate. To demonstrate this we need only
recall that for each item in the sample, one and only one of the $R_i$ has the value
1, the others being equal to 0. Thus any arbitrary number added to each of the
$b_{1i}$ and subtracted from $c_1$ leaves the value of Y identically unaffected. To look
at the matter another way, there is perfect linear multiple correlation among
the $R_i$; any attempt to estimate the regression parameters of (1.1) will fail
because of singularity in the moments matrix. To obtain determinate estimates of
the parameters of (1.1) we must impose an additional constraint. This can be
done, for example, by pre-assigning a value to one of the $b_{1i}$, or to their average,
or to $c_1$, etc.
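
The singularity can be seen by counting the linearly independent columns of the data matrix. The sketch below is an illustration only: the six-family sample is invented, and `rank` and `design` are hypothetical helper names. The rank of the data matrix equals the rank of its moments matrix, so a rank below the number of columns means the normal equations have no unique solution.

```python
from fractions import Fraction

def rank(matrix):
    """Rank of a matrix by Gauss-Jordan elimination in exact rational arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

# Hypothetical sample: income X and region (1, 2 or 3) for six families
incomes = [3, 5, 4, 6, 2, 7]
regions = [1, 1, 2, 2, 3, 3]

def design(include_constant, dummies):
    """Rows of [X, selected dummy variables, optional constant column]."""
    rows = []
    for x, reg in zip(incomes, regions):
        row = [x] + [1 if reg == d else 0 for d in dummies]
        if include_constant:
            row.append(1)
        rows.append(row)
    return rows

full = design(True, (1, 2, 3))        # three dummies and a constant: 5 columns
no_constant = design(False, (1, 2, 3))  # constant term set to zero: 4 columns
drop_one = design(True, (1, 2))       # one dummy omitted: 4 columns
print(rank(full), rank(no_constant), rank(drop_one))
```

Only the first layout has more columns than its rank; either constraint removes the redundancy.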

Among the possibilities there are two that are particularly useful. (a) We
may set $c_1 = 0$. The effect of this preassignment is to convert (1.1) into the
homogeneous form

$$Y = aX + b_{21}R_1 + b_{22}R_2 + b_{23}R_3 + u, \tag{1.2}$$

the coefficients of which may be estimated in the usual way. It will be recalled
that the normal equations used to estimate the regression coefficients of a
homogeneous form involve moments around zero, rather than around means.
This saves a step in calculation. Moreover, since $R_i R_j = 0$ for all $i \ne j$, the matrix
of moments among the dummy variables is diagonal. Placing this diagonal
matrix at the top of a Doolittle format makes the hand calculation of the
forward solution particularly easy. (b) A convenient alternative to (a) is to set one
of the $b_{1i} = 0$. The one selected may be designated without loss of generality as
$b_{13}$, and (1.1) is converted into

$$Y = aX + b_{31}R_1 + b_{32}R_2 + c_3 + u. \tag{1.3}$$

This form is an ordinary nonhomogeneous regression equation in which $R_3$ does
not appear as an independent variable. Dropping out the variable $R_3$ does not,
of course, reduce the amount of information incorporated in the analysis since
its values are identically derivable from $R_1$ and $R_2$.

Since (1.2) and (1.3) represent merely different constraints imposed on (1.1),
they necessarily yield identical estimates of Y; and while the direct
interpretations of the two versions differ, parameter estimates for one are readily
derived from those obtained for the other. The $b_{2i}$ of (1.2) measure regional
influences as deviations from zero; and (1.2) can be interpreted as a linear
regression of Y on X, the intercept of which ($b_{2i}$) varies from region to region.
By adding $-b_{23}$ to each of the $b_{2i}$ of (1.2) and $+b_{23}$ to the (zero) constant
term of (1.2), we obtain form (1.3). Thus $b_{3i} = b_{2i} - b_{23}$, and the $b_{3i}$ measure
regional shifts in the regression of Y on X as deviations from the intercept of
region 3 taken as a base.
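
The relation between the two sets of estimates can be verified on a small example. The sketch below is illustrative only: the data are invented, the helper names `solve` and `least_squares` are hypothetical, and the normal equations are solved in exact rational arithmetic so that the identities hold without rounding.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square system a x = b exactly by Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if m[i][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for i in range(n):
            if i != col:
                factor = m[i][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[col])]
    return [row[n] for row in m]

def least_squares(design, y):
    """Coefficients from the normal equations (Z'Z) beta = Z'y."""
    k = len(design[0])
    ztz = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    zty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(ztz, zty)

# Hypothetical data: income X, region, pounds of sweet potatoes Y
x = [3, 5, 4, 6, 2, 7]
region = [1, 1, 2, 2, 3, 3]
y = [10, 13, 20, 22, 5, 12]

# Homogeneous form: X and all three dummies, no constant
form_a = [[xi, int(r == 1), int(r == 2), int(r == 3)] for xi, r in zip(x, region)]
# Nonhomogeneous form: X, two dummies, and a constant column
form_b = [[xi, int(r == 1), int(r == 2), 1] for xi, r in zip(x, region)]

a2, b21, b22, b23 = least_squares(form_a, y)
a3, b31, b32, c3 = least_squares(form_b, y)
print(a2, b21, b22, b23)
print(a3, b31, b32, c3)
```

The slope on X is the same in both fits, the constant of the second fit equals the region 3 coefficient of the first, and the remaining coefficients differ by exactly that amount.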

Since the parameters of the two forms are related by a linear transformation, 
so are the variances and covariances of the parameter estimates. In particular 
it may be noted that the variances of the estimates of the $b_{3i}$ in (1.3) may be





550 AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1957 


readily calculated from the variance-covariance matrix of the estimates of the
$b_{2i}$ in (1.2) by the usual formula for the variance of the difference between two
random variates.¹


2. INTERACTIONS


In addition to regional shifts in the intercept of the regression of Y on X we 
may also be interested in regional variation in slope. This may be investigated 
by introducing an interaction term involving X and the dummy variables. A 
regression model with this interaction is represented as

$$Y = (a + d_1R_1 + d_2R_2 + d_3R_3)X + b_1R_1 + b_2R_2 + b_3R_3 + c + u. \tag{2.1}$$

The problem of indeterminacy arises among the $d_i$ and a as well as among
the $b_i$ and c, and it is necessary to impose two constraints on (2.1). The same
convenient alternatives present themselves and we may either set a=c=0, or,
for some i, set $d_i = b_i = 0$. One could, of course, elect to set a=0 and, say, $b_i = 0$.
This would yield a determinate solution but would be awkward to interpret, 
and one should either formulate both interaction term and direct regional ef- 
fects as homogeneous forms, or should drop one of the dummy variables from 
the equation. The differences in interpretation of these alternatives correspond 
to those of section 1. 


3. SEVERAL SYSTEMS OF CLASSES 


Considerations similar to those outlined above hold for situations in which a 
number of different classifications are investigated simultaneously. In the de- 
mand for sweet potatoes of section 1, for example, the items of the sample may 
be assigned to, say, occupational and racial classes. 

In the general case we have t systems of classification, of which the ith
contains $k_i$ mutually exclusive classes. We define t sets of dummy variables
$R_{ij}$ ($i = 1, 2, \ldots, t$; $j = 1, 2, \ldots, k_i$) so that $R_{ij} = 1$ if the item belongs to the
jth class of the ith system; in all other cases $R_{ij} = 0$. The generalization of (1.1)
is then

$$Y = aX + \sum_{i=1}^{t} \sum_{j=1}^{k_i} b_{ij}R_{ij} + c + u. \tag{3.1}$$

It is clear that for any set of constants $b_i^*$ ($i = 1, 2, \ldots, t$) and $c^*$ such that
$\sum_{i=1}^{t} b_i^* + c^* = 0$, Y is identically unaffected by the substitution of $b_{ij} + b_i^*$ and
$c + c^*$ in place of $b_{ij}$ and c in (3.1), and determinate results again require a
system of constraints. This is most easily arranged by dropping out one dummy
variable from each set; i.e. select a $j_i$ for each system of classes i, and
pre-assign $b_{ij_i} = 0$ ($i = 1, 2, \ldots, t$). This is a generalization of procedure (b) of
section 1.








¹ More generally, let $V = (v_{ij})$ be the variance-covariance matrix of parameter estimates of (1.2) so arranged
that $i, j = 1, 2, \ldots, k$ are associated with conventionally scaled variables, and $i, j = k+1, \ldots, n$ with dummy
variables. Without loss of generality let $i, j = n$ be associated with that dummy variable which appears in (1.2) but
not in (1.3). Let A be a matrix obtained from a unit matrix by subtracting the nth row from each of the rows
$k+1, k+2, \ldots, n-1$. Then if $V^*$ is the variance-covariance matrix of parameter estimates of (1.3), $V^* = AVA'$.
4. BROADER ASPECTS OF DUMMY VARIABLES


One occasionally encounters suspicion of dummy variables and a feeling that 
somehow something not quite respectable is involved in their use. Although this
reference to other uses of dummy variables sometimes has practical application 
it is intended primarily to dispel this feeling. Perhaps part of the trouble lies in 
the use of the term “dummy” variable. There is nothing artificial about such 
variables; indeed in a fundamental sense they are more properly scaled than 
conventionally measured variables. If we conceive the task of regression analy- 
sis to be that of providing an estimate of a dependent variable, given certain 
information, the use of linear regression yields biased estimates in the event of 
curvature. By partitioning the scale of a conventionally measured variable into 
intervals and defining a set of dummy variables on them, we obtain unbiased 
estimates since the regression coefficients of the dummy variables conform to 
any curvature that is present. 

This procedure can be fruitfully applied to a variable like age, the influence 
of which is frequently U-shaped. Attempts to use chronological age as a linear 
variable may lead not only to the bias mentioned above, but to the failure of 
the variable to show significance in the regression. Although we sometimes 
resort to the use of a quadratic form in age to capture this curvature, there is 
little additional difficulty and in general better results in the application of a 
system of dummy variables defined by age classes. 
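
A small constructed example shows the point about a U-shaped influence. The sketch below is illustrative only: the ages, responses and class boundaries are invented, and the function names are hypothetical. With dummy variables as the only regressors in a homogeneous form, the moments matrix is diagonal, so each coefficient is simply the mean response in its age class.

```python
from fractions import Fraction

# Hypothetical U-shaped data: the response is lowest in middle age
ages = [20, 30, 40, 50, 60, 70, 80]
response = [9, 4, 1, 0, 1, 4, 9]

def linear_slope(x, y):
    """Least-squares slope of y on x, in exact arithmetic."""
    n = len(x)
    mx, my = Fraction(sum(x), n), Fraction(sum(y), n)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def age_class_coefficients(x, y, edges):
    """Coefficients of age-class dummies (no constant): one class mean per interval."""
    coeffs = []
    for lo, hi in zip(edges, edges[1:]):
        members = [yi for xi, yi in zip(x, y) if lo <= xi < hi]
        coeffs.append(Fraction(sum(members), len(members)))
    return coeffs

slope = linear_slope(ages, response)
by_class = age_class_coefficients(ages, response, [20, 40, 61, 81])
print(slope, by_class)
```

Here the linear slope is zero, so age would show no influence at all as a linear variable, while the three class coefficients trace out the curvature.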








FITTING A STRAIGHT LINE TO CERTAIN TYPES 
OF CUMULATIVE DATA 


JoHn MANDEL 
National Bureau of Standards 


Many situations giving rise to linear data involve measurements 
made at progressive stages of a physical or chemical process carried out 
on the same subject of experimentation. In such cases, the experimental 
errors include cumulative components related to the process, and the 
errors corresponding to different points on the line are not independent. 
Failure to take such cumulation of errors into account results in serious 
underestimation of the standard error of the estimated rate of the 
process. The standard errors and correlations of the residuals from 
regression are derived both for situations involving cumulative and 
independent errors. The differences between the two situations are 
striking and may be used as a basis for judging which of the two 
types of error is predominant in a given case. The procedure is illus- 
trated by means of the data obtained in an experiment in physical 
optics.


1. INTRODUCTION


A COMMON situation in physical, chemical, and related experimentation
involves the taking of measurements on essentially the same subject or unit
of material at different stages of a given process. A simple example is provided 
by the testing of a specimen of plastic material for its resistance to abrasion, 
by subjecting it for successive test periods to the action of an abrasive wheel 
and measuring the amount of wear at the end of each period. In chemical 
studies, the progress of a chemical reaction may be studied by taking samples 
of the reacting system at specified time intervals and subjecting the samples 
to chemical analysis. This procedure applies, of course, also to the control of 
large batches of reacting materials in chemical industrial processes.

The characteristic feature common to these examples is the use of essentially 
the same subject of experimentation for the entire series of measurements, be 
it a plastic specimen or a batch of reacting materials. The measurements will 
therefore not be statistically independent. 

A second characteristic to be noted in connection with the examples cited is 
the presence, in addition to the usual errors of measurement affecting the obser- 
vations, of a second source of statistical fluctuations, namely, the fluctuations 
involved in the process itself due to small changes in experimental conditions, 
such as temperature, pressure, relative humidity, radiant energy, introduction 
of extraneous materials, and similar factors. In contrast to the usual errors of 
measurement, which occur at the time the system is examined, these additional 
fluctuations generally take place while the process is in progress and are usually 
unrelated to the measurements. We can conveniently denote these two types 
of error as belonging respectively to the “point” type (usual measurement error) 
and the “interval” type (error due to fluctuations in the process). This
terminology serves to emphasize that the errors due to fluctuations in the process
are functions of the length of the interval during which the process is allowed 
to proceed between consecutive measurements. 

The two characteristics, errors of the interval type and nonindependence of 
the measurements, are not necessarily coexistent. If, in the abrasion experi- 
ment, a first specimen had been subjected to one test period, a second specimen 
to two test periods, etc., one would still have interval type errors, but all 
measurements would then have been statistically independent. In this paper, 
we will be concerned with situations typified by the use of a single specimen, 
so that if interval type errors are present, they will also of necessity be non- 
independent. For brevity, we will refer to such errors as “cumulative,” and 
to “point” type errors as “independent.” 

If a series of measurements made on the same system is subject to only one 
of the two categories of error just discussed, i.e., independent or cumulative, 
the analysis is relatively straightforward, at least in its basic approach: for 
independent errors, the measurements can be treated by the usual statistical 
methods predicated on independence, while for errors purely of the cumulative 
type, the device of considering differences of successive measurements will lead 
to a new set of independent observations. But it is to be noted that in general 
these new observations will no longer be homoscedastic, since the variance of 
each difference will depend on the length of the corresponding “interval.” 

In many actual experimental cases, the situation is far from clear-cut with 
regard to what type of error is solely present or even largely predominant. An 
example of such a doubtful situation is discussed at the end of this article. In 
the case in which the errors of a linear process can be partitioned into known 
proportions of independent and cumulative components, a minimum-variance 
unbiased estimate of the rate of the process can readily be obtained [1]. But, 
in general, such a partitioning cannot be made, and one then usually considers 
the data as belonging to one or the other type, on the basis of generally in- 
complete information. 

One purpose of this paper is to evaluate the consequences of an erroneous 
choice concerning the nature of the errors of the process.¹ It will be seen that
the degree of misinterpretation of data through selection of an erroneous as- 
sumption can be very serious. Since this paper is concerned with matters of 
principle rather than comprehensive results, it will be limited to the discussion 
of linear data; in all cases the straight line will be assumed to pass through 
the origin and in most cases the values of the controlled variable will be assumed
equidistantly spaced. 

The second objective of this paper is to discuss the problem of detecting the 
correct model on the basis of an examination of the data, by studying more 
closely the differences in behavior between independent and cumulative data. 
This approach is admittedly not as direct as the more usual one of deriving 
tests of significance,² but it has the advantage of providing a better insight into
the peculiarities of cumulative types of data. 





1 Watson [2] and Watson and Hannan [3] examine a similar problem from a more general viewpoint. They 
examine the consequences of a wrong presumption concerning the correlation matrix of the residuals from a regres- 
sion in regard to the efficiency of the estimates and the significance points of tests of hypotheses. 

2 For tests of significance in related problems, see Moran [4] and Durbin and Watson [5]. 
2. TWO MODELS OF LINEAR REGRESSION

Figure 1 shows straight lines fitted to two sets of data. The construction of
both sets is indicated in Table I. The random variables $\epsilon_i$ were taken from a
table of random normal deviates with zero mean and unit variance. In both
cases, the slope was arbitrarily chosen to be 3. Set A, $Y_i = 3x_i + \epsilon_i$, is typical for
cases to which the usual formulas of the theory of least squares are applicable,
while Set B,

$$Y_i = 3x_i + \sum_{j=1}^{i} \epsilon_j,$$

is a typical example of cumulative data. The straight line fit in both sets was
made by calculating the slope according to the equation:

$$\text{Slope} = \frac{\sum xY}{\sum x^2}. \tag{1}$$

This equation is legitimate for Set A but is of course not the correct least
squares solution for Set B.
Formally, independent and cumulative data are described by the following
two models:

Model A: Independent Data

The independent variable $x_i$ and the measurement $Y_i$ are related by the
equation:

$$Y_i = \beta x_i + \epsilon_i \qquad (i = 1, 2, \ldots, N) \tag{2}$$

where the $\epsilon_i$ are independent random variables with zero expectation and a
common variance, i.e.:

$$\mathrm{Var}(\epsilon_i) = \text{constant, the same for all } i. \tag{3}$$


Model B: Cumulative Data 


Let $L_i$ represent a test interval, $x_i - x_{i-1}$, during which a measured quantity
$Y$ increases by an amount $\Delta Y = z_i$ which, except for random fluctuations, is
proportional to $L_i$. Thus:

$$z_i = Y_i - Y_{i-1} = \beta(x_i - x_{i-1}) + \epsilon_i = \beta L_i + \epsilon_i \qquad (i = 1, 2, \ldots, N;\ x_0 = 0). \tag{4}$$

The $\epsilon_i$ are assumed to be independent random variables with zero expectation.
It is also assumed, in accordance with the discussion in Section 1, that
the variance of each $\epsilon_i$ is a function of the corresponding test interval $L_i$:

$$\mathrm{Var}(\epsilon_i) = f(L_i). \tag{5}$$

The nature of this function will be discussed in Section 3.
Model B is often encountered in a different form:

$$Y_i = \sum_{j=1}^{i} z_j = \beta \sum_{j=1}^{i} L_j + \sum_{j=1}^{i} \epsilon_j = \beta x_i + \sum_{j=1}^{i} \epsilon_j. \tag{6}$$

Fig. 1. Comparison of Independent and Cumulative Data; N = 10. Independent Data:
Set A of Table I. Cumulative Data: Set B of Table I.








TABLE I 
CONSTRUCTION OF TWO SETS OF STRAIGHT LINE DATA 








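Abscissa ($x_i$) | Random normal deviate ($\epsilon_i$) | Set A: $Y_i = 3x_i + \epsilon_i$ | Set B, increments: $z_i = 3 + \epsilon_i$ | Set B, cumulated: $Y_i = \sum_{j=1}^{i} z_j = 3x_i + \sum_{j=1}^{i} \epsilon_j$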








0. 3.63 
—1. 4.67 
—0. 8.78 

3. 13.26 
—0. 14.64 

3. 19.37 
—1. 19.47 

0. 24.85 

1. 28.60 
—0. 29.75 
Equation (6) justifies the use of the adjective “cumulative” to describe Model
B data. Equations (2) and (6) are formally identical, except for the error term. 

While equation (6) has here been derived from equation (4), there may be 
many situations in which the data present themselves quite naturally in the 
form of equation (6). In fact, experimental scientists are primarily interested 
in relationships, and while from a statistical viewpoint it may sometimes be
convenient to consider differences of successive measurements, this is seldom
the form that is of interest to the experimenter. 

Now, Fig. 1 may lead to the impression that the difference between equations 
(2) and (6) is not reflected in the appearance of the plotted data. 

Fortunately, however, the behavior of cumulative data is quite different 
from that of independent data, and the difference becomes detectable even for 
moderate values of N. This is illustrated in Fig. 2 which is the analogue of Fig. 1, 
for N=50. The appearance of both figures will be entirely explained by the 
analysis given in this paper. 
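
The construction of the two kinds of data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `simulate` and `slope_through_origin`, the seed, and the use of a pseudo-random generator in place of a printed table of normal deviates are not taken from the paper.

```python
import random


def simulate(n, beta=3.0, seed=0):
    """Build one independent (Model A) and one cumulative (Model B) data set.

    Both sets share the same stream of unit normal deviates, as in Table I.
    The generator and the seed are assumptions of this sketch.
    """
    rng = random.Random(seed)
    eps = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = list(range(1, n + 1))                      # equidistant abscissas 1..n
    y_indep = [beta * xi + e for xi, e in zip(x, eps)]
    y_cumul = []
    running = 0.0
    for xi, e in zip(x, eps):
        running += e                               # errors accumulate along the process
        y_cumul.append(beta * xi + running)
    return x, y_indep, y_cumul


def slope_through_origin(x, y):
    """Equation (1): slope = sum(x*Y) / sum(x^2)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
```

Calling `simulate(10)` and then `simulate(50)` and plotting both sets against the lines fitted by `slope_through_origin` gives pictures of the kind shown in Figs. 1 and 2.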


3. LEAST SQUARES THEORY FOR MODEL B DATA


It can be shown [6] that the assumptions of independence of the $\epsilon_i$ for
non-overlapping test intervals and the condition expressed by equation (5) imply
that:

$$\mathrm{Var}(\epsilon_i) = f(L_i) = cL_i, \text{ where } c \text{ is a positive constant.} \tag{7}$$


Given a set of observed z-values, $z_1, z_2, \ldots, z_N$, associated respectively with
test intervals $L_1, L_2, \ldots, L_N$ in accordance with equations (4) and (7), the
best unbiased linear estimator of the slope is obtained by minimizing the
weighted sum of squares $\sum_i (1/L_i)(z_i - \beta L_i)^2$.

The resulting estimator is:

$$\hat\beta = \Big(\sum z_i\Big) \Big/ \Big(\sum L_i\Big). \tag{8}$$


If the test intervals represented by the L; form an uninterrupted sequence, 
equation (8) becomes, in terms of cumulative data: $\hat\beta = Y_N/x_N$. Consequently,
in terms of cumulative data, the best fit in this case is simply the line passing 
through the origin and the last point. In view of these results it may appear 
as though all the intermediate measurements were a waste of effort. Actually, 
these additional measurements serve two extremely useful purposes: (1) they 
usually provide additional evidence for the presumed linearity of the process; 
(2) they are indispensable for the estimation of the standard error of the
estimated slope:

$$\mathrm{Var}(\hat\beta) = c \Big/ \sum_{j=1}^{N} L_j. \tag{9}$$

If the quantity $c$ is not known, the variance of $\hat\beta$ is estimated from the sample,
by means of the following equation, in which the symbol $\hat V$ represents an
unbiased estimate of the variance:

$$\hat V(\hat\beta) = \frac{1}{\sum L_i} \cdot \frac{1}{N-1} \sum \frac{1}{L_i}\,(z_i - \hat\beta L_i)^2. \tag{10}$$
Equation (10) involves all measurements despite the fact that the estimation 
of the slope necessitates only the last, or the first and the last. 
It will be shown furthermore that if doubts exist concerning the cumulative 
nature of the data, they can be resolved only by a study of the behavior of all 
of the measurements. 
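
Equations (8) and (10) translate directly into code. The sketch below is a minimal illustration, assuming readings taken on one specimen with $x_0 = 0$ and $Y_0 = 0$; the function name `fit_cumulative` and its argument layout are hypothetical.

```python
def fit_cumulative(x, y):
    """Slope and estimated variance of the slope for purely cumulative data.

    x: increasing abscissas x_1 < ... < x_N (x_0 = 0 is implied).
    y: cumulative readings Y_1, ..., Y_N (Y_0 = 0 is implied).
    Returns (beta_hat, var_hat) following equations (8) and (10).
    """
    n = len(x)
    if n < 2 or len(y) != n:
        raise ValueError("need at least two matching (x, y) readings")
    lengths = [x[0]] + [x[i] - x[i - 1] for i in range(1, n)]   # test intervals L_i
    incs = [y[0]] + [y[i] - y[i - 1] for i in range(1, n)]      # increments z_i
    total_length = sum(lengths)
    beta_hat = sum(incs) / total_length                         # equation (8), equals Y_N / x_N
    weighted_ss = sum((z - beta_hat * l) ** 2 / l for z, l in zip(incs, lengths))
    c_hat = weighted_ss / (n - 1)                               # estimate of c in equation (7)
    var_hat = c_hat / total_length                              # equation (10)
    return beta_hat, var_hat
```

The slope uses only the last reading, while the variance estimate draws on every increment, which is the point made in the text.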
Fig. 2. Comparison of Independent and Cumulative Data; N = 50.


Equations (8) and (9) deserve some attention from the viewpoint of experiment
design. Consider the simplest case: $L_i = 1$ for all $i$. Then, equations (8)
and (9) become respectively $\hat\beta = Y_N/N$ and $V(\hat\beta) = c/N$. If, instead of making
all measurements on the same specimen, a different specimen were used for
each measurement, the most efficient slope estimator would be
$\hat\beta = (\sum Y)/(\sum x) = 2\sum Y/N(N+1)$ and its variance would be
$c/(\sum x) = 2c/N(N+1)$. Thus, the standard error of the slope would be decreased
by the factor $\sqrt{(N+1)/2}$ which, even for moderately large values of N,
constitutes a considerable improvement in precision.
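
As a worked instance of this comparison (the value N = 10 is chosen here only for illustration):

$$\frac{c/N}{2c/N(N+1)} = \frac{N+1}{2}, \qquad N = 10:\quad \frac{11}{2} = 5.5, \qquad \sqrt{5.5} \approx 2.35.$$

With ten readings, then, the use of separate specimens would reduce the standard error of the slope by a factor of about 2.35.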


4. MISINTERPRETATIONS OF MODEL B DATA 


When data that are essentially of the Model B type are not recognized as 
such, and treated as if Model A applied, misinterpretation results. The follow- 
ing notation will be helpful in a quantitative analysis of the situation. 

Any sample estimate obtained by the “correct” treatment (in this case,
recognizing the data correctly as belonging to Model B) is denoted by a
caret (^).

Any sample estimate obtained by the “incorrect” treatment (in this case,
considering Model B data erroneously as belonging to Model A) is denoted by
a tilde (~). Thus, $\tilde\beta = (\sum xY)/(\sum x^2)$.

We seek the answers to three questions:

(a) Is $\tilde\beta$ an unbiased estimator of the slope $\beta$?
(b) What is the standard error of $\tilde\beta$ and how does it compare to that of $\hat\beta$?
(c) What is the standard error of $\tilde\beta$ estimated to be (in the “incorrect”
treatment); i.e., what is the expression for $\tilde V(\tilde\beta)$?

With regard to (a), it is easily shown that the estimator $\tilde\beta$ is unbiased.
Answers to (b) and (c) depend on the magnitude and the spacing of the
independent variable x. In this paper we only deal with the case: $x_1 = 1$;
$x_2 = 2$; $\ldots$; $x_N = N$. Under this assumption, it can be shown that:


$$V(\tilde\beta) = \frac{6}{5} \cdot \frac{2N^2 + 2N + 1}{N(N+1)(2N+1)} \cdot c \tag{11}$$

where $c$ is the constant occurring in (7) and

$$\tilde V(\tilde\beta) = \frac{6}{(N-1)N(N+1)(2N+1)} \sum d_i^2 \tag{12}$$

where the “residual” $d_i$ (the vertical deviation of the experimental point from
the fitted line) is defined as:

$$d_i = Y_i - \tilde\beta x_i. \tag{13}$$

The complete answer to (b) is now readily obtained by comparing (11) and
(9), remembering that under our assumption regarding the x's, all $L_i = 1$. Thus:

$$\frac{V(\tilde\beta)}{V(\hat\beta)} = \frac{6}{5} \cdot \frac{2N^2 + 2N + 1}{(N+1)(2N+1)}. \tag{14}$$

For N = 10, this ratio = 1.148; for N = 100, it is 1.194, and for N = ∞, it tends
to 1.200. Thus, the “incorrect” estimate $\tilde\beta$ is somewhat less efficient than the
“correct” estimate $\hat\beta$. Its asymptotic efficiency = 1/1.2 = 83.3 per cent.

So far, the “incorrect” method does not fare too badly, since it yields an un- 
biased, and only slightly inefficient estimate of the slope. Real misinterpreta- 
tion arises, however, when the “incorrect” method is carried to the point of 
estimating the standard error of the slope. Indeed, it can be shown that the 
average value of the (incorrectly) estimated variance of the slope is: 

$$E[\tilde V(\tilde\beta)] = \frac{3(N+2)}{5N(N+1)(2N+1)} \cdot c. \tag{15}$$

Comparing this value to the true variance of the estimated slope, given by
(11), we obtain:

$$\frac{E[\tilde V(\tilde\beta)]}{V(\tilde\beta)} = \frac{N+2}{2(2N^2 + 2N + 1)}. \tag{16}$$

This relation shows that, on the average, this “incorrect” treatment will
vastly underestimate the standard error of the slope. For example, for N = 10,
the factor of underestimation is 6.1 and for N = 100 it is 19.9. A very good
approximation to this factor is $2\sqrt{N}$, even for N as small as 10.

In Fig. 1, if both sets are treated as Model A data, the estimated standard 
error of the slope has an expected value of 0.051 for the independent data and
0.056 for the cumulative data. Actually, for the latter data even the most
efficient slope estimator has an expected standard error of $1/\sqrt{10} = 0.32$.
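
The numerical values quoted in this section follow from equations (14) and (16) and can be checked with a short sketch. The function names below are hypothetical; the formulas are those of the text for the case $x_i = i$.

```python
import math


def efficiency_ratio(n):
    """Equation (14): V(beta~) / V(beta^) for cumulative data with x_i = i."""
    return 1.2 * (2 * n * n + 2 * n + 1) / ((n + 1) * (2 * n + 1))


def underestimation_factor(n):
    """Factor by which the standard error is underestimated on the average.

    This is the reciprocal square root of the ratio in equation (16).
    """
    ratio = (n + 2) / (2 * (2 * n * n + 2 * n + 1))
    return 1.0 / math.sqrt(ratio)
```

For N = 10 and N = 100 the two functions return about 1.148 and 1.194, and about 6.1 and 19.9, respectively, against the approximation $2\sqrt{N}$ of 6.3 and 20.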


5. MISINTERPRETATIONS OF MODEL A DATA


The correct least squares estimator of the slope for Model A data is of course 
given by equation (1). If, through an erroneous assumption regarding the 
model, the analysis is based on differences of successive observations, the slope 
will be estimated by: 


$$\tilde\beta = \frac{Y_N}{N}. \tag{17}$$

This estimator is unbiased. Its true variance is:

$$V(\tilde\beta) = \frac{\sigma^2}{N^2}. \tag{18}$$

Since the variance of the correct estimator, for the case $x_i = i$, is:

$$V(\hat\beta) = \frac{6\sigma^2}{N(N+1)(2N+1)}, \tag{19}$$

we have

$$\frac{V(\tilde\beta)}{V(\hat\beta)} = \frac{(N+1)(2N+1)}{6N}. \tag{20}$$

Thus, the slope estimator $\tilde\beta$ is very inefficient and its efficiency decreases as N
increases. If not only the slope estimator, but also its variance estimate are
based on the erroneous model, the expected value of this variance estimate is:

$$E[\tilde V(\tilde\beta)] = \frac{2N+1}{N^2}\,\sigma^2. \tag{21}$$

Comparing (21) with (18), it is seen that the incorrect estimate of the standard
error of the slope is large by a factor of the order of $\sqrt{2N}$.
It is clear from the preceding results that treatment of Model A data by the 
difference method leads to a very poor estimate of the slope and to an even 
worse estimate of its precision (in the conservative direction). 
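
The order of magnitude of these effects can be checked exactly, without sampling, from the covariance structure of the successive differences. The sketch below is an illustration under the assumption that the line passes through the origin, so that the first difference is $Y_1$ itself; the function names are hypothetical.

```python
def difference_covariance(n, sigma2=1.0):
    """Covariance matrix of z_i = Y_i - Y_{i-1} (with Y_0 = 0) under Model A."""
    cov = [[0.0] * n for _ in range(n)]
    for i in range(n):
        cov[i][i] = sigma2 if i == 0 else 2.0 * sigma2   # z_1 carries one error, the rest two
        if i > 0:
            cov[i][i - 1] = cov[i - 1][i] = -sigma2      # neighbours share one error term
    return cov


def expected_variance_estimate(n, sigma2=1.0):
    """Expected value of equation (10), with L_i = 1, when Model A actually holds."""
    cov = difference_covariance(n, sigma2)
    trace = sum(cov[i][i] for i in range(n))
    grand_total = sum(sum(row) for row in cov)           # variance of the sum of the z_i
    expected_ss = trace - grand_total / n                # E[sum (z_i - mean z)^2]
    return expected_ss / (n * (n - 1))


def inefficiency(n):
    """Equation (20): variance ratio of the difference method to least squares."""
    return (n + 1) * (2 * n + 1) / (6 * n)
```

For N = 10 the difference method has 3.85 times the variance of the least squares estimator, and `expected_variance_estimate(10)` is 21 times the true variance $\sigma^2/N^2$, in line with a standard error too large by a factor of the order of $\sqrt{2N}$.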


6. BEHAVIOR OF RESIDUALS IN MODEL A AND B DATA 


Reviewing the results of Sections 4 and 5, it may be concluded that insofar 
as one wishes to obtain merely an estimate of the slope of a linear process, one 
cannot fare too badly by using the ordinary least squares formula (equation 1). 
Even in the extreme case of purely cumulative errors, this estimate is not only 
unbiased but rather efficient. The use of the difference method, on the other 
hand, gives very inefficient estimates if the data are largely of the independent 
type. 

The estimation of the precision of the estimated slope, however, is a much 
more delicate question and in general terms, one will very greatly overestimate
the precision of cumulative data by applying to them the usual least squares
formulas. 

It would be very helpful if one could infer from the data themselves whether 
they are predominantly of one or the other type. The following discussion deals 
with this problem. 

Let us assume that observed y-values are plotted for N equidistant x-values,
$x_1 = 1, x_2 = 2, \ldots, x_N = N$, and that a straight line has been fitted by equation
(1), i.e., assuming Model A. We now ask whether the scatter of the experimental
points about the fitted line can possibly betray the true nature of the
underlying model. This scatter is characterized by the behavior of the residuals and
we can study the latter by examining: (1) the magnitude of the residual at each 
point (each x-value), and (2) the relation of each residual to the preceding (or
following) one. Since, on the average, each residual is zero, its magnitude in 
absolute value is best characterized by its standard error. As to the relation 
between consecutive residuals, if the errors affecting the data are assumed 
normally distributed, the residuals will, in either model, have a joint multi- 
variate normal distribution, and in this case an effective measure of relation- 
ship will be given by the correlation coefficient. A positive coefficient of correla- 
tion between consecutive residuals indicates a tendency for consecutive points 
to be on the same side with respect to the fitted line. A negative coefficient 
indicates the likelihood of a “crossing” of the line. If $\rho_{j,j+1}$ represents the
correlation coefficient between the $j$th and $(j+1)$st residual, then the probability,
$P_j$, that these consecutive residuals be on opposite sides of the fitted line can be
shown (see Appendix) to be:

$$P_j = \frac{1}{2} - \frac{1}{\pi} \arcsin \rho_{j,j+1}. \tag{22}$$

The expected number of “crossings” of the fitted line is then given by

$$\sum_{j=1}^{N-1} P_j.$$

The behavior of the residuals in the two models can now be inferred from
the following formulas, the derivation of which is outlined in the appendix:

Model A

Standard error of residuals:

$$\sigma_{d_k} = \sigma \sqrt{1 - \frac{k^2}{\sum x^2}}. \tag{23}$$

Correlation of successive residuals:

$$\rho_{j,j+1} = -\frac{j(j+1)}{\sqrt{\left[\sum x^2 - j^2\right]\left[\sum x^2 - (j+1)^2\right]}} \tag{24}$$

where $\sum x^2 = \tfrac{1}{6} N(N+1)(2N+1)$.

Model B

Standard error of residuals:

$$\sigma_{d_k} = \sqrt{c}\,\sqrt{k}\,\sqrt{1 + \frac{2k^3}{N(N+1)(2N+1)} - \frac{2(3N+1)(3N+2)k}{5N(N+1)(2N+1)}}. \tag{25}$$

Correlation of successive residuals:

$$\rho_{j,j+1} = +\sqrt{\frac{j\,\left[10j^3 + 20j^2 - (18N^2 + 18N - 11)j + (10N^3 - 3N^2 - 13N + 1)\right]^2}{(j+1)\,f(j)\,f(j+1)}} \tag{26}$$

where:

$$f(k) = 10k^3 - 2(3N+1)(3N+2)k + 5N(N+1)(2N+1). \tag{27}$$

From these formulas the following conclusions can be drawn:


(i) Model A 


The residuals have standard errors that are slightly smaller than the standard
deviation of measurement $\sigma$. The standard error is smaller for residuals
at large x than for those corresponding to small x-values. For sets involving a
large number of points (large N), the difference between the standard error of
the residual and $\sigma$ is very small, even for large x.

Consecutive residuals are only slightly correlated. Their correlation is
always negative. For large N, the correlation becomes very small for all x-values.


(ii) Model B 


The standard errors of the residuals may differ considerably from $\sqrt{c}$, the
standard deviation of the process per unit-interval of x (cf. equation (7)). In
fact, the standard errors of the residuals corresponding to gradually increasing
x-values will reveal the following trend. For the first third of the way, they
gradually increase; from there on, and for a length approximately one-half the
total range of x-values they decrease gradually; finally, they increase rapidly
for the remainder of the points. For large N (number of points), the standard
errors of all residuals are considerably larger than $\sqrt{c}$, and, in fact, they tend to
∞ as N increases indefinitely, for any x for which x/N is not infinitely small.

The correlation between consecutive residuals is positive and appreciable.
Even for N as low as 20, the lowest value of the correlation coefficient of
consecutive residuals is +0.56, and the average coefficient for all 19 consecutive
pairs is +0.74. For situations involving a large number of points (large N), the
behavior of the correlation coefficient of successive residuals is as follows: for
fixed low values of x, the coefficient is close to x/(x+1). For values of x for
which x/N remains finite, the correlation coefficient approaches +1, as N
increases indefinitely.

Summarizing these results, it can be stated, in rather non-rigorous language,
that for large N, the experimental points in Model A data tend to be scattered
at random above and below the fitted line. The standard error of the residuals
is always less than the standard deviation of measurement, $\sigma$, but approaches
$\sigma$ rapidly as N tends to infinity. In Model B, on the other hand, assuming again
N to be large, the residuals tend to be extremely large as compared to the 
standard deviation of the process per unit-interval, and the experimental points 
tend to remain on one side of the fitted line for long sequences of points. Thus, 
for large N, it should be relatively easy to infer the underlying model from the 
behavior of the residuals. 
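
The behavior described in this section can be explored numerically. The following sketch builds the covariance matrix of the residuals directly from the covariances of the $Y_i$ (zero between different points in Model A, $c \cdot \min(i, j)$ in Model B), rather than from the closed forms (23) through (27), and then applies equation (22). Working from the covariances keeps the sign of each correlation. The function names and the `scale` argument, which stands for $\sigma^2$ or $c$, are assumptions of this illustration.

```python
import math


def residual_covariance(n, model, scale=1.0):
    """Covariance matrix of d_i = Y_i - b*x_i, with b = sum(xY)/sum(x^2) and x_i = i."""
    if n < 2:
        raise ValueError("need at least two points")
    idx = range(1, n + 1)
    s = sum(i * i for i in idx)                          # sum of x^2

    if model == "A":
        def cov_y(i, j):
            return scale if i == j else 0.0              # independent errors
    elif model == "B":
        def cov_y(i, j):
            return scale * min(i, j)                     # shared cumulative errors
    else:
        raise ValueError("model must be 'A' or 'B'")

    # Covariance of each Y_i with the fitted slope b, and the variance of b.
    cov_yb = {i: sum(k * cov_y(i, k) for k in idx) / s for i in idx}
    var_b = sum(i * cov_yb[i] for i in idx) / s
    return [[cov_y(i, j) - i * cov_yb[j] - j * cov_yb[i] + i * j * var_b
             for j in idx] for i in idx]


def expected_crossings(cov):
    """Sum of equation (22) over all pairs of consecutive residuals."""
    total = 0.0
    for j in range(len(cov) - 1):
        rho = cov[j][j + 1] / math.sqrt(cov[j][j] * cov[j + 1][j + 1])
        rho = max(-1.0, min(1.0, rho))                   # guard against rounding
        total += 0.5 - math.asin(rho) / math.pi
    return total
```

For a given N, the square roots of the diagonal of `residual_covariance(N, "B")` trace the rise, fall, and final rise of the standard errors described above, and `expected_crossings` gives a smaller value for Model B than for Model A.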


7. DETECTION OF THE UNDERLYING MODEL 


The preceding results strongly suggest that a useful criterion for the distinc- 
tion between the two models might be found in the number of times the re- 
siduals, taken in sequence, change their sign, in other words, in the number of 
crossings of the fitted line. 

An empirical study of the distribution of the number of crossings for various 
values of N was made by means of a sampling experiment using the National 
Bureau of Standards computer, SEAC, both to generate the Model B data and 
to analyze them. The results are summarized in Table II. If it is remembered 
that, in the case of Model A data, the probability of a crossing is approximately
$\tfrac{1}{2}$ at each x value, and that the expected number of crossings is therefore
close to $(N-1)/2$, it is seen that for values of N exceeding, say 30 or 40, the
nature of the data can be inferred with sufficient confidence from the number
of crossings. In fact, the expected number of crossings, for Model B data,
seems to be very closely approximated by $\sqrt{N}$, at least for N up to 200.

When N is of the order of 30 or more, the distinction between the two models 
is relatively easy, because of the noticeable difference in the behavior of the 
residuals. Thus, in Fig. 2 (N = 50), the number of crossings is 26 for the
independent data, and only 7 for the cumulative data. On the other hand, for values
of N less than 20, the behavior of the data in both models is sufficiently alike 
to make it almost impossible to differentiate between them on the basis of a 
single set of data. This explains the failure of the two graphs in Fig. 1 to reflect 
their corresponding models. However, as will be shown in the following section, 
the behavior of the data can lead to valuable inferences about the underlying 
model even when N is as low as 6, provided that a sufficiently large number of 
sets of data are available. 
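
Counting crossings for a given set of data is straightforward. The sketch below is a minimal illustration; the function name `count_crossings` is hypothetical, and the treatment of a residual that is exactly zero (it is skipped) is an assumption not discussed in the paper.

```python
def count_crossings(x, y):
    """Number of sign changes between consecutive residuals.

    The line is fitted through the origin by equation (1).
    Residuals that are exactly zero are skipped (an assumption of this sketch).
    """
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    residuals = [b - slope * a for a, b in zip(x, y)]
    signs = [r > 0 for r in residuals if r != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)
```

The count can then be compared with the two benchmarks given in the text, about $(N-1)/2$ for independent data and about $\sqrt{N}$ for cumulative data.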


8. APPLICATION TO AN EXPERIMENT IN PHYSICAL OPTICS 


Table III summarizes the results obtained in an experiment carried out by 
Waxler and Napolitano [7] on the double refraction produced in optical glasses 
by the application of stress. The relation underlying this phenomenon is a 
simple proportional relationship between the optical path difference, r, and 
the stress, t:


r= Ct (28) 


where C is a constant characteristic of the type of glass under investigation. The 
object of the investigation was the evaluation of the constant C for 27 different 
TABLE II


NUMBER OF CROSSINGS OF FITTED LINE FOR MODEL B DATA
Results of a Sampling Experiment


Number of points (N) | Number of sets examined | Number of crossings*: Average | Std. dev. | Lowest | Highest







10 
15 
20 
25 


.73 
48 
17 
* A crossing is a change of sign between


TABLE III 


RESULTS OF A STUDY OF STRESS EFFECTS IN OPTICAL GLASSES 
Number of Crossings for 27 Sets of 6 Points Each 








Location* of crossing | Observed number of crossings | Expected number of crossings: Model A | Model B











Average†
Std. dev. 








* The notation (i, i+1) in this column signifies the passage from the ith to the (i+1)st residual.
† Average number of crossings per set.


types of optical glass. The stress, t, was varied by increasing the load from zero 
to 300 pounds in six consecutive increments of 50 pounds. From our present 
viewpoint, the important feature was the use of a single specimen for each type 
of glass, so that the experiment could very well give rise to cumulative errors. 

The average number of crossings observed in the 27 sets was 2.11 with a 
standard deviation, from set to set, of 1.01. Thus, the standard error of the
average is $1.01/\sqrt{27} = 0.19$, which indicates good agreement of the average with the
theoretical value, 1.86, based on the cumulative model. The breakdown in the 
table according to location of crossing is even more revealing than the over-all 
average, and substantiates the conclusion that the cumulative error in these 
data overshadows largely any random measurement errors.
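
The arithmetic behind this comparison can be written out as a short sketch. The function name is hypothetical; the figures 2.11, 1.01, 27, and 1.86 are those quoted in the text.

```python
import math


def standard_error_of_mean(std_dev, n_sets):
    """Standard error of an average over n_sets independent sets."""
    return std_dev / math.sqrt(n_sets)


se = standard_error_of_mean(1.01, 27)          # about 0.19
deviation_in_se = (2.11 - 1.86) / se           # observed minus cumulative-model value
```

The observed average lies a little more than one standard error above the value expected under the cumulative model.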
APPENDIX
Derivation of Equations (22) through (27)

(i) Equation (22) is based on the identity

$$\int_0^\infty\!\!\int_0^\infty e^{-(au^2 + buv + cv^2)}\,du\,dv = \frac{1}{\sqrt{4ac - b^2}} \left[\frac{\pi}{2} - \arctan \frac{b}{\sqrt{4ac - b^2}}\right]. \tag{A1}$$

Setting the quadratic form equal to that of the bivariate normal density gives

$$\int_0^\infty\!\!\int_0^\infty f(u, v)\,du\,dv = \frac{1}{4} + \frac{1}{2\pi} \arcsin \rho \tag{A2}$$

where f(u, v) is the probability density function of the general bivariate normal
distribution with correlation $\rho$. Expression (A2) represents the probability that
u and v are both positive and, for reasons of symmetry, also the probability
that they are both negative. Consequently, the probability of a change of sign
is

$$1 - 2\left(\frac{1}{4} + \frac{1}{2\pi} \arcsin \rho\right) = \frac{1}{2} - \frac{1}{\pi} \arcsin \rho.$$

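(ii) Equations (23) and (24) are derived as follows: Assuming the variance
of a measurement in Model A to be $\sigma^2$, the relation $\hat\beta = (\sum xY)/(\sum x^2)$ leads to

$$\mathrm{Var}(\hat\beta) = \sigma^2 \Big/ \sum x^2$$

and

$$\mathrm{Cov}(Y_i, \hat\beta) = x_i \sigma^2 \Big/ \sum x^2.$$

Since

$$\mathrm{Cov}(d_i, d_j) = \mathrm{Cov}(Y_i - \hat\beta x_i,\ Y_j - \hat\beta x_j) = \mathrm{Cov}(Y_i, Y_j) - x_i\,\mathrm{Cov}(Y_j, \hat\beta) - x_j\,\mathrm{Cov}(Y_i, \hat\beta) + x_i x_j\,\mathrm{Var}(\hat\beta),$$

it follows that:

$$\mathrm{Cov}(d_i, d_j) = \mathrm{Cov}(Y_i, Y_j) - \frac{x_i x_j}{\sum x^2}\,\sigma^2. \tag{A3}$$

For $i = j$, we have

$$\mathrm{Cov}(Y_i, Y_i) = \mathrm{Var}(Y_i) = \sigma^2;$$

hence:

$$\mathrm{Var}(d_i) = \sigma^2\left[1 - \Big(x_i^2 \Big/ \sum x^2\Big)\right]. \tag{A4}$$

For $i \neq j$, we have

$$\mathrm{Cov}(Y_i, Y_j) = 0;$$

hence:

$$\mathrm{Cov}(d_i, d_j) = -\frac{x_i x_j}{\sum x^2}\,\sigma^2. \tag{A5}$$

Equations (23) and (24) follow at once from (A4) and (A5) by making $x_k = k$.

(iii) Equations (25) and (26) are derived as follows: In equation (8) make
$L_i = 1$, and express $\tilde\beta$ in terms of the $z_i$. This leads to the linear relation

$$\tilde\beta = \sum_i w_i z_i \tag{A6}$$

where

$$w_i = \frac{3(N+i)(N+1-i)}{N(N+1)(2N+1)}. \tag{A7}$$

Recalling that $\mathrm{Var}(z_i) = c$, it follows from (A6) and (A7) that:

$$\mathrm{Var}(\tilde\beta) = \frac{6}{5} \cdot \frac{2N^2 + 2N + 1}{N(N+1)(2N+1)} \cdot c \qquad \text{and} \qquad \mathrm{Cov}(Y_i, \tilde\beta) = c \sum_{k=1}^{i} w_k.$$

Since

$$\mathrm{Cov}(d_i, d_j) = \mathrm{Cov}(Y_i - \tilde\beta i,\ Y_j - \tilde\beta j) = \mathrm{Cov}(Y_i, Y_j) - i\,\mathrm{Cov}(Y_j, \tilde\beta) - j\,\mathrm{Cov}(Y_i, \tilde\beta) + ij\,\mathrm{Var}(\tilde\beta)$$

it follows that:

$$\mathrm{Cov}(d_i, d_j) = \mathrm{Cov}(Y_i, Y_j) + \frac{ij\left[i^2 + j^2 - \dfrac{18N^2 + 18N + 4}{5}\right]}{N(N+1)(2N+1)} \cdot c. \tag{A8}$$

For $i = j$, we have

$$\mathrm{Cov}(Y_i, Y_i) = \mathrm{Var}(Y_i) = ci$$

and (A8) becomes:

$$\mathrm{Var}(d_i) = \frac{2i^4 - \dfrac{18N^2 + 18N + 4}{5}\,i^2 + N(N+1)(2N+1)\,i}{N(N+1)(2N+1)} \cdot c. \tag{A9}$$

For $i > j$ we have

$$\mathrm{Cov}(Y_i, Y_j) = \mathrm{Cov}\left(\sum_{k=1}^{i} z_k,\ \sum_{k=1}^{j} z_k\right) = \sum_{k=1}^{j} \mathrm{Var}(z_k) = cj$$

and (A8) becomes:

$$\mathrm{Cov}(d_i, d_j) = \frac{j\left\{N(N+1)(2N+1) + i\left[i^2 + j^2 - \dfrac{18N^2 + 18N + 4}{5}\right]\right\}}{N(N+1)(2N+1)} \cdot c. \tag{A10}$$

Equation (A9) leads at once to equation (25), and by making $i = j+1$ in (A10)
one obtains equation (26), with f(k) defined as in equation (27).


AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1957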
ESTIMATING THE LOGISTIC CURVE


H. SILVERSTONE
University of Otago, New Zealand 


In estimating the parameters of the logistic curve when the response 
at any given dose is binomial, the method of “maximum likelihood” 
and the method of “minimum logit chi-square” have both been sug- 
gested. Computational difficulties formerly associated with the maxi- 
mum likelihood solution have now largely disappeared. In a comparison 
of the two methods it is shown that the maximum likelihood estimates, 
but not the minimum logit chi-square estimates, are sufficient estima- 
tors for the parameters. Particular attention is paid to samples where 
all but one of the responses are zero or 100 per cent. Also, the effects 
of using the “2n-rule” for handling the minimum logit chi-square esti- 
mation problem for cases of zero or 100 per cent kill in any class are 
examined. The two sets of estimators are also surveyed from the view- 
point of consistency and some remarks made on the criterion of mini- 
mum mean square error estimators when sufficient estimators exist. 


1. INTRODUCTION


IN the treatment of binomial response data, the logistic curve has been
suggested as an alternative to the integrated normal curve. In practical fitting
problems it has been found by a number of investigators that the logistic curve 
appears to fit appropriate data as well as does the integrated normal curve. 
Controversial problems have, however, been raised, particularly by Berkson 
[2], [3], concerning the best method of estimating the parameters. Berkson 
has proposed that his method of “minimum logit chi-square” should be used 
in preference to maximum likelihood. Much of the attraction of the former
method has arisen from its simplicity. However, the computational difficulties 
associated with the maximum likelihood estimates have now largely dis- 
appeared. (See, for example, a very useful paper by Berkson [4].) In addition, 
as has been pointed out by Anscombe [1], there are cases where it may not be 
possible to avoid an iterative solution for the minimum logit chi-square esti- 
mates. With computational questions largely out of the way, it would appear 
to be an appropriate time to examine the two methods by more fundamental 
criteria. 

2. MAXIMUM LIKELIHOOD AND MINIMUM LOGIT CHI-SQUARE SOLUTIONS


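It is assumed that the probability of “death” at dose $x_i$ is $P_i$ and that the
response at any given dose is binomial. The true probabilities $P_i$ are assumed
to lie on the logistic curve

$$P_i = \frac{1}{1 + e^{-(\alpha + \beta x_i)}} \tag{1}$$

so that the logits $L_i = \ln(P_i/Q_i)$, where $Q_i = 1 - P_i$, lie on the straight line

$$L_i = \alpha + \beta x_i. \tag{2}$$

If $n_i$ = the number exposed at dose $x_i$ $(i = 1, 2, \ldots, k)$, and $r_i = n_i p_i$ = the
number of deaths observed at dose $x_i$, the likelihood function is

$$\prod_{i=1}^{k} \binom{n_i}{r_i} P_i^{r_i} Q_i^{n_i - r_i}. \tag{3}$$

On using (1) we find that if $\Lambda = \ln(\text{likelihood})$, then

$$\Lambda = T_1 R_1 + T_2 R_2 + T_3 + R_3 \tag{4}$$

where $T_1 = \alpha$, $T_2 = \beta$, $T_3 = \sum n_i \ln Q_i$, $R_1 = \sum r_i$, $R_2 = \sum r_i x_i$, and
$R_3 = \sum \ln \binom{n_i}{r_i}$. Since

$$\frac{\partial P_i}{\partial \alpha} = -\frac{\partial Q_i}{\partial \alpha} = P_i Q_i
\quad\text{and}\quad
\frac{\partial P_i}{\partial \beta} = -\frac{\partial Q_i}{\partial \beta} = P_i Q_i x_i,$$

it is readily found that

$$\frac{\partial \Lambda}{\partial \alpha} = \sum n_i (p_i - P_i), \qquad
\frac{\partial \Lambda}{\partial \beta} = \sum n_i x_i (p_i - P_i). \tag{5}$$

It follows that $(\partial\Lambda/\partial\alpha, \partial\Lambda/\partial\beta)$ is a one-one function of
$(\sum n_i p_i, \sum n_i x_i p_i)$, so that the latter is a minimal sufficient statistic for
$(\alpha, \beta)$. In case $\beta$ is known and only $\alpha$ is to be estimated, $\sum n_i p_i$ is minimal
sufficient for $\alpha$.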
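The factorisation in (4) can be checked numerically. The Python sketch below is an illustration only: the function name `log_likelihood`, the design (three doses at -1, 0, 1 with 10 exposed at each) and the two samples are assumptions chosen for the example, not taken from the paper. The two samples share the same values of $\sum r_i$ and $\sum r_i x_i$, so their log-likelihoods should differ only through the constant term $R_3$, whatever $(\alpha, \beta)$ is tried.

```python
from math import comb, exp, log

def log_likelihood(alpha, beta, x, n, r):
    """Log of the binomial likelihood (3) under the logistic curve (1)."""
    total = 0.0
    for xi, ni, ri in zip(x, n, r):
        P = 1.0 / (1.0 + exp(-(alpha + beta * xi)))
        total += log(comb(ni, ri)) + ri * log(P) + (ni - ri) * log(1.0 - P)
    return total

# Hypothetical design and samples (assumptions for illustration).
x = (-1, 0, 1)
n = (10, 10, 10)
sample_a = (1, 6, 4)   # sum r = 11, sum r*x = 3
sample_b = (2, 4, 5)   # sum r = 11, sum r*x = 3

# Difference of the R3 terms, which does not involve (alpha, beta).
r3_gap = sum(log(comb(ni, ra)) - log(comb(ni, rb))
             for ni, ra, rb in zip(n, sample_a, sample_b))

trial_values = [(-1.0, 0.5), (0.0, 1.0), (0.7, 2.5)]
diffs = [log_likelihood(a, b, x, n, sample_a) - log_likelihood(a, b, x, n, sample_b)
         for a, b in trial_values]

for (a, b), d in zip(trial_values, diffs):
    print(f"alpha={a:5.2f} beta={b:4.2f}  difference={d:.10f}  R3 gap={r3_gap:.10f}")
```

Because the difference never depends on $(\alpha, \beta)$, the two samples carry the same information about the parameters, which is what sufficiency of $(\sum n_i p_i, \sum n_i x_i p_i)$ asserts.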

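The maximum likelihood equations for the estimate $(\hat\alpha, \hat\beta)$ are

$$\begin{aligned}
\sum n_i (p_i - \hat p_i) &= 0 \\
\sum n_i x_i (p_i - \hat p_i) &= 0
\end{aligned} \tag{6}$$

where $\hat p_i = 1/\{1 + e^{-(\hat\alpha + \hat\beta x_i)}\}$. (In the sequel, the symbols $\hat{}$ and $\tilde{}$, respectively,
will be used to denote estimates obtained by maximum likelihood and by
minimum logit chi-square.) No explicit solution of (6) exists, and iterative
methods are employed (see Section 9).

Defining logit chi-square as $\sum n_i p_i q_i (l_i - L_i)^2$, where $l_i = \ln(p_i/q_i)$, and
minimising with respect to $\alpha$ and $\beta$, we have, for $\tilde\alpha$ and $\tilde\beta$,

$$\begin{aligned}
\sum n_i p_i q_i (l_i - \tilde l_i) &= 0 \\
\sum n_i p_i q_i x_i (l_i - \tilde l_i) &= 0
\end{aligned} \tag{7}$$

where $\tilde l_i = \tilde\alpha + \tilde\beta x_i$. These can be more conveniently written as

$$\begin{aligned}
\Big(\sum n_i p_i q_i\Big)\tilde\alpha + \Big(\sum n_i p_i q_i x_i\Big)\tilde\beta &= \sum n_i p_i q_i l_i \\
\Big(\sum n_i p_i q_i x_i\Big)\tilde\alpha + \Big(\sum n_i p_i q_i x_i^2\Big)\tilde\beta &= \sum n_i p_i q_i x_i l_i.
\end{aligned} \tag{8}$$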
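The normal equations (8) form a 2 x 2 linear system and can be solved directly. The sketch below is a minimal illustration, not the authors' computing procedure: the function name `min_logit_chi_square` is hypothetical, the sample (observed proportions .2, .8, .8 at doses -1, 0, 1) is the third sample of Table II, and $n_i = 10$ is an assumed value (a constant $n_i$ cancels from the solution). It assumes every observed proportion lies strictly between 0 and 1.

```python
from math import log

def min_logit_chi_square(x, n, p):
    """Solve the normal equations (8) for the minimum logit chi-square estimate.

    Assumes 0 < p_i < 1 for every dose, so that all logits are finite.
    """
    w = [ni * pi * (1.0 - pi) for ni, pi in zip(n, p)]      # weights n_i p_i q_i
    l = [log(pi / (1.0 - pi)) for pi in p]                  # observed logits l_i
    s_w = sum(w)
    s_wx = sum(wi * xi for wi, xi in zip(w, x))
    s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    s_wl = sum(wi * li for wi, li in zip(w, l))
    s_wxl = sum(wi * xi * li for wi, xi, li in zip(w, x, l))
    det = s_w * s_wxx - s_wx ** 2                           # Cramer's rule
    alpha = (s_wl * s_wxx - s_wx * s_wxl) / det
    beta = (s_w * s_wxl - s_wx * s_wl) / det
    return alpha, beta

alpha_t, beta_t = min_logit_chi_square((-1, 0, 1), (10, 10, 10), (0.2, 0.8, 0.8))
print(round(alpha_t, 3), round(beta_t, 3))   # 0.462 1.386
```

The printed values agree with the entry $(\tilde\alpha, \tilde\beta) = (.462, 1.386)$ given for that sample in Table II.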
In [3] Berkson says that whereas the minimum logit chi-square estimate
$(\tilde\alpha, \tilde\beta)$ is sufficient for $(\alpha, \beta)$, the same is not true of the maximum likelihood
estimate $(\hat\alpha, \hat\beta)$. The test used is as follows: A statistic $T$ is sufficient for a
parameter $\theta$ if all samples yielding the same value of $T$ also have the same value
of $\partial\Lambda/\partial\theta$. In the present case $(\partial\Lambda/\partial\alpha, \partial\Lambda/\partial\beta)$ may be replaced by the minimal
sufficient pair $(\sum n_i p_i, \sum n_i p_i x_i)$. (See, for example, Fisher [7].) Conversely,
if it is possible to establish that, for a particular estimate $(\alpha^*, \beta^*)$,
$(\sum n_i p_i, \sum n_i p_i x_i)$ is not a single-valued function of $(\alpha^*, \beta^*)$, then the latter is
not sufficient for $(\alpha, \beta)$.

When no adjustments are made in the case of zero or one hundred per cent 
kills, certain samples lead to infinite values for one or both of the maximum 
likelihood estimates, and it is in respect of such samples that Berkson claims 
that sufficiency breaks down for the maximum likelihood solution. 


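3. SUFFICIENCY OF THE MAXIMUM LIKELIHOOD SOLUTION

For the maximum likelihood estimates we solve the equations

$$\begin{aligned}
\eta(\hat\alpha, \hat\beta) &= \sum n_i \hat p_i = \sum n_i p_i \\
\zeta(\hat\alpha, \hat\beta) &= \sum n_i \hat p_i x_i = \sum n_i p_i x_i.
\end{aligned} \tag{9}$$

The Jacobian $\partial(\eta, \zeta)/\partial(\hat\alpha, \hat\beta)$ is equal to

$$\begin{vmatrix}
\sum n_i \hat p_i \hat q_i & \sum n_i \hat p_i \hat q_i x_i \\
\sum n_i \hat p_i \hat q_i x_i & \sum n_i \hat p_i \hat q_i x_i^2
\end{vmatrix} \tag{10}$$

and this reduces to

$$\sum_{i>j} n_i n_j \hat p_i \hat q_i \hat p_j \hat q_j (x_i - x_j)^2. \tag{11}$$

Except when $k-1$ of the $\hat p_i$ are 0 or 1, the Jacobian is positive. By considering
the signs and the continuity of the derivatives which form the elements of
(10), it can be shown that, apart from the exceptional cases already mentioned,
$(\hat\alpha, \hat\beta)$ is a one-one function of $(\eta, \zeta)$ everywhere. From (9) it thus follows that
$(\hat\alpha, \hat\beta)$ is a one-one function of the minimal-sufficient pair $(\sum n_i p_i, \sum n_i p_i x_i)$,
that is, is sufficient for $(\alpha, \beta)$ over the whole sample space except, possibly,
over that sub-set of points of the sample space for which $k-1$ of the estimates
$\hat p_i$ are either 0 or 1.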
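The reduction of the determinant (10) to the sum (11) may be spelled out as follows. The abbreviation $w_i = n_i \hat p_i \hat q_i$ is introduced here only for brevity and is not part of the original notation.

```latex
\begin{aligned}
\begin{vmatrix}
\sum_i w_i & \sum_i w_i x_i \\
\sum_i w_i x_i & \sum_i w_i x_i^2
\end{vmatrix}
 &= \Big(\sum_i w_i\Big)\Big(\sum_j w_j x_j^2\Big)
  - \Big(\sum_i w_i x_i\Big)\Big(\sum_j w_j x_j\Big) \\
 &= \sum_i \sum_j w_i w_j \,\big(x_j^2 - x_i x_j\big) \\
 &= \tfrac{1}{2} \sum_i \sum_j w_i w_j \,\big(x_i^2 - 2 x_i x_j + x_j^2\big) \\
 &= \sum_{i>j} w_i w_j \,(x_i - x_j)^2 ,
 \qquad w_i = n_i \hat p_i \hat q_i .
\end{aligned}
```

The third line uses the symmetry of the double sum in $i$ and $j$. Every term of the final sum is non-negative, and since the doses are distinct a term vanishes only when $w_i w_j = 0$. The sum is therefore positive unless at most one of the $w_i$ is non-zero, that is, unless $k-1$ of the $\hat p_i$ are 0 or 1.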

Berkson [3] has discussed the case where there are 3 equally spaced doses,
taken as $x_i = -1, 0, 1$, respectively, and the same number ($n_i = 10$) are exposed
at each dose. Denoting a particular sample by $[p_1, p_2, p_3]$, where the entries
are the observed proportions killed, the six classes of sample leading to
infinite estimates for either or both of $\hat\alpha$ and $\hat\beta$ are

(i) $[p_1, 0, 0]$     (ii) $[0, 0, p_3]$     (iii) $[0, p_2, 1]$
(iv) $[p_1, 1, 1]$    (v) $[1, 1, p_3]$      (vi) $[1, p_2, 0]$

where, in the first place, the $p_i$ may be taken as distinct from 0 or 1.

We now introduce the degenerate curves which are members of the system
of logistic curves defined in equation (1) and which provide solutions for the
above cases.

Consider the set of logistic curves which pass through a given point. This
point may, without loss of generality, be taken as $(0, p_0)$, where, in the first
place, we take $p_0$ as different from 0 or 1. Any curve in this set is then fixed
uniquely by a second point on it, since we are dealing with a two-parameter
system. Take this second point, again without loss of generality, as $(1, p_1)$.
Then for any point $(x, P)$ on this curve it is readily shown that

$$\frac{P}{Q} = \Big(\frac{p_0}{q_0}\Big)^{1-x} \Big(\frac{p_1}{q_1}\Big)^{x} \tag{12}$$

where $Q = 1 - P$, $q_0 = 1 - p_0$, $q_1 = 1 - p_1$.

Keeping $p_0$ fixed, let $p_1 \to 1$. When $x > 0$, $P \to 1$; when $x < 0$, $P \to 0$; while when
$x = 0$, $P = p_0$. Similarly, if $p_0$ is fixed while $p_1 \to 0$, then when $x > 0$, $P \to 0$; when
$x < 0$, $P \to 1$; while when $x = 0$, $P = p_0$.
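The passage to the degenerate curve can be watched numerically. The sketch below assumes that equation (12) has the form $P/Q = (p_0/q_0)^{1-x}(p_1/q_1)^{x}$, which is what linearity of the logit in $x$ implies for a curve through $(0, p_0)$ and $(1, p_1)$. The function name `curve_through` and the value $p_0 = 0.3$ are illustrative assumptions.

```python
def curve_through(p0, p1, x):
    """P at dose x on the logistic curve through (0, p0) and (1, p1).

    Uses the odds form P/Q = (p0/q0)**(1 - x) * (p1/q1)**x.
    """
    odds = (p0 / (1.0 - p0)) ** (1.0 - x) * (p1 / (1.0 - p1)) ** x
    return odds / (1.0 + odds)

p0 = 0.3
for p1 in (0.9, 0.999, 1.0 - 1e-12):
    values = [round(curve_through(p0, p1, x), 6) for x in (-0.5, 0.0, 0.5)]
    print(f"p1 = {p1!r}: P at x = -0.5, 0, 0.5 -> {values}")
```

As $p_1$ approaches 1 the fitted value at $x = 0$ stays at $p_0$, while the values to the left and right of the origin move towards 0 and 1 respectively, giving the step-shaped curve described in the text.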

It follows that there are degenerate curves of the system which have formulas
of the types (i) $P = 0$ for $x < x_0$, $P = p_0$ for $x = x_0$, $P = 1$ for $x > x_0$; and (ii) $P = 1$
for $x < x_0$, $P = p_0$ for $x = x_0$, $P = 0$ for $x > x_0$.

It also follows that degenerate curves of the system can be made to pass
through each of the six sample points listed above. This means that the maximum
likelihood equations $\sum \hat p_i = \sum p_i$ and $\sum \hat p_i x_i = \sum p_i x_i$ can be given the
solutions $\hat p_i = p_i$, $i = 1, 2, 3$, in each of the six cases in question and that
degenerate curves of the system can be drawn through the sample points. For
example, the solution for sample (iv) is

$$P = \begin{cases}
0 & \text{for } x < -1 \\
p_1 & \text{for } x = -1 \\
1 & \text{for } x > -1.
\end{cases} \tag{13}$$


In the case where a sufficient statistic exists, maximum likelihood claims to 
yield the whole of the information contained in the sample. A moment’s re- 
flection will show that, in the case of the example just given, it would be un- 
reasonable to suppose that the sample could yield any further information than 
that provided by the maximum likelihood solution. The fault, if any, lies not 
in the nature of that solution but rather in the choice of doses made in the 
experiment (though it must be admitted that this type of thing is likely to 
occur even in the best run experiments). 

We note that each distinct sample in the six classes has a distinct value of
$(\sum p_i, \sum p_i x_i)$ and also that each distinct sample leads to a different fitted curve.
This fits in with the geometrical requirements of the notion of sufficiency. 

Finally, we have to consider cases where the $p_i$ in classes (i) to (vi) may take
values 1 or 0. This leads to six special samples

(vii) [1, 0, 0]     (viii) [0, 1, 1]     (ix) [1, 1, 1]
(x) [1, 1, 0]       (xi) [0, 0, 1]       (xii) [0, 0, 0]

To show that degenerate curves of the system may be passed through these
sets of values, consider the limit of the right member of (12) as $p_0 \to 0$ and $p_1 \to 1$.
When $x > 1$, $P \to 1$. When $x < 0$, $P \to 0$. When $0 < x < 1$, $P$ is indeterminate.
Hence, in general, we have degenerate curves of the form $P = 0$ for $x < x_0$, $P = 1$
for $x > x_1$, while $P$ is indeterminate for $x_0 < x < x_1$. Similarly, there are curves
$P = 1$ for $x < x_0$, $P = 0$ for $x > x_1$, while $P$ is indeterminate for $x_0 < x < x_1$.

The remaining cases are given when $p_0$ and $p_1$ both tend to 1, and when $p_0$
and $p_1$ both tend to zero. In the former case, for $x > 1$ and $x < 0$ the curve is
indeterminate, while for $0 < x < 1$, $P = 1$. Similarly, if $p_0$ and $p_1$ both tend to
zero the curve is given by $P = 0$ for $0 < x < 1$ while $P$ is indeterminate outside
this range. Generally, this means that the curves $P = 1$ or $P = 0$ for $x_0 < x < x_1$,
$P$ indeterminate elsewhere, are degenerate curves of the system.

It follows that the solution of the maximum likelihood equations for samples
(vii) to (xii) may be taken as $\hat p_i = p_i$, $i = 1, 2, 3$, and the corresponding degenerate
curves drawn through the sample points. For example, the solution for sample
(viii) is

$$P = \begin{cases}
0 & \text{for } x \le -1 \\
1 & \text{for } x \ge 0
\end{cases} \tag{14}$$

while $P$ is indeterminate for $-1 < x < 0$.


Again the solution is perfectly reasonable, and desirable, since it is obvious
that a sample such as [0, 1, 1] can provide no information about the nature of
the true response curve between $x = -1$ and $x = 0$. The drug appears ineffective
at doses not exceeding $-1$ and completely lethal at doses not less than zero on
the $x$-scale in question.

All six final cases lead to distinct solutions which are themselves distinct
from all others previously obtained. The minimal sufficient statistics are also
all mutually distinct.

Thus the sufficiency of $(\hat\alpha, \hat\beta)$ in all cases has been established.


4. NON-SUFFICIENCY OF THE MINIMUM LOGIT CHI-SQUARE ESTIMATE 


Since $pq \ln(p/q)$ tends to zero when $p$ tends to 1 and when $p$ tends to 0, it
is obvious that $(\tilde\alpha, \tilde\beta)$ cannot be sufficient for $(\alpha, \beta)$ in terms of the original
definition, since an observed $p_i$ of 0 can be replaced by an observed $p_i$ of 1 in
equations (7) without affecting the solution. Such a substitution would, however,
alter the value of the minimal sufficient statistic. In addition, if all but
one of the observed $p_i$ are either 0 or 1, the equations become indeterminate.
It is in order to avoid the latter type of difficulty that Berkson has introduced
the so-called “2n-rule” according to which an observed $p_i$ of zero is replaced by
$p_i = 1/(2n_i)$ and an observed $p_i$ of unity is replaced by $p_i = 1 - 1/(2n_i)$.

Berkson’s claim that $(\tilde\alpha, \tilde\beta)$ is sufficient is based on computations of the 1331
values of $(\tilde\alpha, \tilde\beta)$ for the case of 3 equally spaced doses, with 10 exposed at each
dose. Using the 2n-rule where necessary, he found that all 1331 samples gave
different values of $(\tilde\alpha, \tilde\beta)$, while in the case where $\beta$ was fixed at a particular
value (0.84370 . . . ) he found that whenever two samples gave the same value
of $\tilde\alpha$ they also gave the same value of $\sum n_i p_i$.

However, this apparent sufficiency is merely a reflection of the particular
sample size used, and the property does not necessarily hold for all other values
of the $n_i$.

Taking the $x_i$ as $-1, 0, 1$, respectively, consider any sample of the form
$[p_1, 1 - p_1, p_1]$. Equations (8) yield the solution $\tilde\alpha = \tfrac{1}{3} \ln(p_1/q_1)$, $\tilde\beta = 0$. Next
consider any sample of the form $[p_2, p_2, p_2]$. The solution here is $\tilde\alpha = \ln(p_2/q_2)$,
$\tilde\beta = 0$. Suppose now that $(p_1/q_1) = (p_2/q_2)^3$. Both samples then give $\tilde\alpha = \ln(p_2/q_2)$,
$\tilde\beta = 0$. However, the minimal sufficient statistics are, respectively, $(1 + p_1, 0)$
and $(3p_2, 0)$, and these are in general different.

As a numerical example, take $n_i = 9$ $(i = 1, 2, 3)$, and observations [8, 1, 8]
and [6, 6, 6], respectively. Both samples give $(\tilde\alpha, \tilde\beta)$ as $(\ln 2, 0)$, but the
minimal sufficient statistics take values (17/9, 0) and (2, 0), respectively.
The maximum likelihood estimates are $(\ln 1.7, 0)$ and $(\ln 2, 0)$, respectively.

It is clear, therefore, that $(\sum n_i p_i, \sum n_i p_i x_i)$ is not a single-valued function
of $(\tilde\alpha, \tilde\beta)$, and that the latter is not sufficient for $(\alpha, \beta)$.
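The numerical counter-example above can be reproduced in a few lines. The sketch below is illustrative: the helper names `min_logit`, `sufficient_statistic` and `ml_score` are hypothetical, and the sufficient statistic is expressed in proportions, $(\sum p_i, \sum p_i x_i)$, as in the example. Rather than solving the maximum likelihood equations, it simply checks that the quoted estimates make the left-hand sides of (6) vanish.

```python
from fractions import Fraction
from math import exp, isclose, log

X = (-1, 0, 1)   # doses
N = 9            # number exposed at each dose

def min_logit(r):
    """Minimum logit chi-square estimate from equations (8), equal n at each dose."""
    p = [ri / N for ri in r]
    w = [N * pi * (1 - pi) for pi in p]
    l = [log(pi / (1 - pi)) for pi in p]
    s_w = sum(w)
    s_wx = sum(wi * xi for wi, xi in zip(w, X))
    s_wxx = sum(wi * xi * xi for wi, xi in zip(w, X))
    s_wl = sum(wi * li for wi, li in zip(w, l))
    s_wxl = sum(wi * xi * li for wi, xi, li in zip(w, X, l))
    det = s_w * s_wxx - s_wx ** 2
    return (s_wl * s_wxx - s_wx * s_wxl) / det, (s_w * s_wxl - s_wx * s_wl) / det

def sufficient_statistic(r):
    """(sum p_i, sum p_i x_i) as exact fractions."""
    return (sum(Fraction(ri, N) for ri in r),
            sum(Fraction(ri, N) * xi for ri, xi in zip(r, X)))

def ml_score(alpha, beta, r):
    """Left-hand sides of the maximum likelihood equations (6)."""
    fit = [1 / (1 + exp(-(alpha + beta * xi))) for xi in X]
    return (sum(N * (ri / N - fi) for ri, fi in zip(r, fit)),
            sum(N * xi * (ri / N - fi) for ri, xi, fi in zip(r, X, fit)))

for sample in ((8, 1, 8), (6, 6, 6)):
    print(sample, min_logit(sample), sufficient_statistic(sample))

print(ml_score(log(1.7), 0.0, (8, 1, 8)))   # both components essentially zero
print(ml_score(log(2.0), 0.0, (6, 6, 6)))   # both components essentially zero
```

Both samples return the same minimum logit chi-square estimate, yet their sufficient statistics differ, so the estimate cannot determine the sufficient statistic.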


5. UNIQUENESS AND MAXIMUM LIKELIHOOD 


In [3], on the basis of an examination of his total population of 1331 possible
cases, Berkson observes that “the maximum likelihood estimate of $\alpha$ with $\beta$
known is the same for each sample in a sufficiency group,” a sufficiency group
being defined as the group of all samples having the same value of the minimal
sufficient statistic $\sum n_i p_i$. This, as may be seen from the foregoing discussion,
is all that is required by the so-called “uniqueness theorem” for maximum
likelihood estimates. Unfortunately, some confusion has been caused by incorrect
statements of this theorem. (See, for example, Kendall [10].) The incorrect
statement of the theorem makes it appear that if a sufficient statistic exists it
is either the maximum likelihood estimate or a function of the maximum
likelihood estimate. Obviously, it is the converse of this which is true. If $T$ is
sufficient for $\theta$ we may write the likelihood function in the form
$\Phi(x; \theta) = g(T, \theta) \cdot h(x)$, as usual. The maximum likelihood solution is obtained by using

$$\frac{\partial}{\partial\theta} \ln g(T, \theta) = 0 \tag{15}$$

to give $\hat\theta$ as a single-valued function of $T$, but not necessarily as a one-one
function of $T$. The maximum likelihood estimate is a function of any sufficient
statistic, but the converse is not necessarily true. This point has subsequently
been noted by Berkson [4].


6. TRANSFORMATIONS OF PARAMETERS 


In a footnote to [3], Berkson remarks that the nature of the maximum likelihood
concept would lead us to believe that “if a sample S is very frequent with
some particular value of $\theta$, such a sample would characteristically yield $\theta$ by
maximum likelihood, or estimates near $\theta$.” He then points out that if the true
values of $\alpha$ and $\beta$ are $\alpha = 0$ and $\beta = 4.595$, samples of the form $[0, p, 1]$ have an
a priori probability of about .8 of occurring; that is, 80 per cent of solutions
are expected to give $\hat\beta = \infty$, whereas the “true” value is 4.595.
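The figure of about .8 follows directly from the binomial probabilities at the two extreme doses. The sketch below is illustrative; the function name `prob_zero_p_one` is hypothetical, and the design (doses -1, 0, 1 with 10 exposed at each) is the one described for Berkson's example in Section 3. A sample has the form $[0, p, 1]$ when nothing dies at $x = -1$ and everything dies at $x = 1$, whatever happens at $x = 0$.

```python
from math import exp

def prob_zero_p_one(alpha, beta, n=10):
    """Probability of no deaths at x = -1 and n deaths at x = +1 under curve (1)."""
    P_low = 1.0 / (1.0 + exp(-(alpha - beta)))    # true P at x = -1
    P_high = 1.0 / (1.0 + exp(-(alpha + beta)))   # true P at x = +1
    return (1.0 - P_low) ** n * P_high ** n

prob = prob_zero_p_one(0.0, 4.595)
print(round(prob, 3))   # 0.818
```

With $\beta = 4.595$ (close to $\ln 99$) the true probabilities at the extreme doses are about .01 and .99, so the required probability is roughly $.99^{20}$.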
It is usually held to be desirable that in estimating a parameter $\theta$ by any
given principle of estimation, we should at the same time be estimating all
one-one functions of $\theta$, by the same principle. However, if we use a transformation
$\phi(\theta)$ which has a singularity at, say, $\theta = \theta_0$, then situations such as that noted
by Berkson will occur if the subset of sample points yielding estimates $\theta^* = \theta_0$
has a probability measure which is at our disposal. Indeed, by suitably selecting
the “true” value of $\theta$, or by postulating a suitable prior distribution of $\theta$, we
may be able to produce estimates $\theta^*$ in the neighbourhood of $\theta_0$ with any
pre-assigned relative frequency. With $\phi(\theta^*)$ unbounded in the neighbourhood of $\theta_0$
we may thus be able to produce large discrepancies between $\phi(\theta)$ and $\phi(\theta^*)$
with the given relative frequency.

For instance, in the case of the ordinary binomial distribution with parameter
$\theta$, both maximum likelihood and minimum chi-square methods of estimation
yield $\theta^* = r/n$ as the estimate of $\theta$, and consequently $n/r$ as the estimate of
$\phi(\theta) = 1/\theta$ wherever $\phi(\theta)$ is defined. By suitably selecting the true value of $\theta$ we
may produce samples where $r = 0$ with any pre-assigned expected relative
frequency. If $\theta$ is constrained to lie in the half-open interval $0 < \theta \le 1$ for the
purposes of the transformation, while $\theta^*$ is allowed to take values in the closed
interval $0 \le \theta^* \le 1$, then $\phi^* = \infty$ appears with the pre-assigned relative frequency
while $\phi$ remains finite.

If the logit transformation assumes that $\alpha + \beta x$ is not infinite for any finite
value of $x$, that is, that the “true” value of $P$ cannot be 0 or 1 for any finite
dose, then the paradox of infinite “maximum likelihood” estimates of parameters
constrained to be finite in a non-closed domain must remain. However, if
one is prepared to admit the degenerate system of curves described above (thus
closing the domain of $(\alpha, \beta)$), the apparent contradiction disappears, although
it will still be possible to select a priori “true” values of $(\alpha, \beta)$ which will make
the maximum likelihood solution appear “unlikely.” However, alternative pro- 
cedures, such as truncating the observations so as to exclude the awkward 
cases, appear to be attended by certain disadvantages which may turn out to 
be actually more serious than the disadvantages which the amended method is 
designed to overcome. In addition, there is often the logical objection that the 
amended procedures used for the awkward samples involve a departure from 
the original principle of estimation, be it maximum likelihood, minimum chi- 
square or any other principle. 


7. CONSISTENCY OF THE ESTIMATES 


The adoption of an expedient such as the “2n-rule” may have the effect (as
it does in the present case) of truncating the ranges of the estimates. For
instance, considering the problem as being one of fitting the straight line
$L = \alpha + \beta x$ to the points $l_1, l_2, l_3$, it is seen that in the case where $\beta$ is given,
we certainly have

$$-\beta - \ln(2n - 1) \le \tilde\alpha \le \ln(2n - 1) + \beta \tag{16}$$

since the maximum and minimum values allowed to the $l_i$ by the 2n-rule are,
respectively, $\ln(2n - 1)$ and $-\ln(2n - 1)$.
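The truncation of the observed logits can be seen by applying the rule to every possible count at a single dose. The sketch below is a plain illustration of the 2n-rule as stated in Section 4; the function names `apply_2n_rule` and `empirical_logit`, and the choice $n = 10$, are assumptions made for the example.

```python
from math import log

def apply_2n_rule(r, n):
    """Observed proportion r/n, with 0 replaced by 1/(2n) and 1 by 1 - 1/(2n)."""
    if r == 0:
        return 1.0 / (2 * n)
    if r == n:
        return 1.0 - 1.0 / (2 * n)
    return r / n

def empirical_logit(r, n):
    """Observed logit l = ln(p/q) after the 2n-rule has been applied."""
    p = apply_2n_rule(r, n)
    return log(p / (1.0 - p))

n = 10
logits = [empirical_logit(r, n) for r in range(n + 1)]
bound = log(2 * n - 1)
print(f"smallest logit {min(logits):.4f}, largest logit {max(logits):.4f}, "
      f"ln(2n - 1) = {bound:.4f}")
```

Since every admissible $l_i$ lies between $-\ln(2n-1)$ and $\ln(2n-1)$, any weighted average of the quantities $l_i - \beta x_i$ with $x_i = -1, 0, 1$ is confined to the interval in (16).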
The effect of this is that the minimum logit chi-square estimates are not
consistent in Fisher’s sense for finite samples (see [8] and [9]) when the 2n-rule
is used, since one finds, on substituting the expected values of the frequencies
in the formulae for $\tilde\alpha$ and $\tilde\beta$ that the solutions are different from $\alpha$ and $\beta$.
However, when the number tested at each dose tends to infinity, the estimates $\tilde\alpha$
and $\tilde\beta$ tend in the ordinary sense, and hence stochastically, to the true values
of the parameters. Hence we have consistency in the sense of convergence in
probability, but not Fisher-consistency. When the 2n-rule is not used, we find
consistency in both senses.

The maximum likelihood estimates are consistent in both senses. If one
substitutes the expected frequencies $P_i$ for the observed frequencies $p_i$ in equations
(6) one sees that the maximum likelihood solution yields the true values of $\alpha$
and $\beta$. This solution is given by $\hat p_i = P_i$ $(i = 1, 2, 3)$, since (i) the $\hat p_i$ so defined
lie on a curve of the system; (ii) they satisfy the maximum likelihood equations;
(iii) the solution is unique; (iv) the inversion from the $\hat p_i$ to $(\hat\alpha, \hat\beta)$ is unique and
must therefore yield the true values $(\alpha, \beta)$. Since this is true for all values of
the $n_i$, it is clear that Fisher-consistency implies consistency in the sense of
convergence in probability.

The numerical effect of the 2n-rule is also worth considering. For example,
with 3 equally spaced doses, 10 at each dose, the sample [2, 5, 10] gives an
unadjusted $(\tilde\alpha, \tilde\beta)$ of (0, 1.386), while, using the 2n-rule, the estimate becomes
(0.292, 1.923). In [3] Berkson shows that if the true values of the $P_i$ are,
respectively, .3, .5, .7, the standard error of $\tilde\alpha$ is 0.392 and that of $\tilde\beta$ is 0.518.
Hence, the adjustment to $\tilde\alpha$ is somewhere in the region of three-quarters of the
standard error of $\tilde\alpha$, and that to $\tilde\beta$ is about as large as the standard error of
$\tilde\beta$ itself. The use of the 2n-rule would thus appear to be in need of greater
justification than that so far provided by its author.


8. ESTIMATING THE L.D. 50 


The unsatisfactory nature of the weights $n_i p_i q_i$ used in fitting by minimum
logit chi-square when observed frequencies are encountered in proximity to
zero or 100 per cent may be shown by a simple example. Consider the following
table of estimates of the L.D. 50 obtained by minimum logit chi-square and
by maximum likelihood. The doses are equally spaced at $x = -1, 0, 1$, respectively,
and there are 40 tested at each dose.

TABLE I

SOME COMPUTATIONS OF THE L.D. 50
($n_i = 40$, $i = 1, 2, 3$)

Sample     $r_3$     $\tilde m$      $\hat m$
(i)        36        .333       .405
(ii)       38        .315       .353
(iii)      39        .415       .324

Commonsense dictates that the estimated L.D. 50 should show a progressive
decrease from sample (i) to sample (iii). While this is indeed the case with the
maximum likelihood estimate, it is not so with the minimum logit chi-square
estimate. The effect of increasing the last observation from 38 to 39 was actually
to decrease both $\tilde\alpha$ and $\tilde\beta$, but increase the estimated L.D. 50.


9. ESTIMATION BY MAXIMUM LIKELIHOOD


In the light of what has been said, it now appears that the advantage of ease
of computation previously advanced as a compelling reason for preferring the
use of minimum logit chi-square is outweighed by other considerations. In
addition, as has been pointed out by Anscombe [1], a fully satisfactory method
of estimating $\alpha$ and $\beta$ by minimising logit chi-square would appear to involve
iteration. For a different reason, namely the lack of an explicit solution of
equations (6), iteration is also required for the maximum likelihood solution. The
process, however, is not tedious.

Following Finney [6], and using the known expressions for $\partial P_i/\partial\alpha$ and
$\partial P_i/\partial\beta$, one finds that if $(\hat\alpha, \hat\beta)$ is a preliminary estimate of $(\alpha, \beta)$, we obtain
the corrections $(\Delta\hat\alpha, \Delta\hat\beta)$ from the equations

$$\begin{aligned}
\Big(\sum n \hat p \hat q\Big)\Delta\hat\alpha + \Big(\sum n \hat p \hat q x\Big)\Delta\hat\beta &= \sum r - \sum n \hat p \\
\Big(\sum n \hat p \hat q x\Big)\Delta\hat\alpha + \Big(\sum n \hat p \hat q x^2\Big)\Delta\hat\beta &= \sum r x - \sum n \hat p x,
\end{aligned} \tag{17}$$

where $\hat p$ and $\hat q = 1 - \hat p$ are evaluated at the preliminary estimate.

In deciding how best to obtain trial values of $\hat\alpha$ and $\hat\beta$ one might with advantage
compare equations (8) and (17), in which it is seen that the matrix of
coefficients on the left hand sides is of the same form. Except when a number of
the observed relative frequencies are near 0 or 1, it would thus appear convenient,
from a computational point of view, to use (8) to obtain the trial values of
$\hat\alpha$ and $\hat\beta$, and then obtain corrections by the use of (17). Comparisons between
the two sets of estimates can be made at the same time.

To illustrate the degree of accuracy obtainable by a single application of (17)
to the solution of (8), some examples were worked out for the case of 3 equally
spaced doses at $x_i = -1, 0, 1$, with $n_i$ constant. For the 3 samples shown in
Table II, the differences between $(\tilde\alpha, \tilde\beta)$ and $(\hat\alpha, \hat\beta)$ increased progressively. In
all cases, however, a single application of equations (17) proved satisfactory,
as may be seen by comparing the values of $(\sum \hat p, \sum \hat p x)$ and $(\sum p, \sum p x)$ in
each case.

A further example was worked out for 5 equally spaced doses, with an equal
number at each dose, the difference between $(\tilde\alpha, \tilde\beta)$ and $(\hat\alpha, \hat\beta)$ again being
appreciable. The $x_i$ were taken as $-2, -1, 0, 1, 2$. The results are shown in
Table III. Again, no second iteration was required.
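The correction scheme of equations (17) is easy to sketch in code. The following is a minimal illustration under stated assumptions, not the authors' worksheet: the function names `fitted` and `correction_step` are hypothetical, $n_i = 10$ is an assumed common sample size, and the data are the third sample of Table II (observed proportions .2, .8, .8). The starting value is the minimum logit chi-square solution of (8) for that sample, which works out to $(\tfrac{1}{3}\ln 4, \ln 4)$.

```python
from math import exp, log

def fitted(alpha, beta, x):
    """Fitted probabilities on the logistic curve (1)."""
    return [1.0 / (1.0 + exp(-(alpha + beta * xi))) for xi in x]

def correction_step(alpha, beta, x, n, r):
    """One application of equations (17); returns the corrected (alpha, beta)."""
    p = fitted(alpha, beta, x)
    w = [ni * pi * (1.0 - pi) for ni, pi in zip(n, p)]       # n p q at the trial value
    s_w = sum(w)
    s_wx = sum(wi * xi for wi, xi in zip(w, x))
    s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    g0 = sum(r) - sum(ni * pi for ni, pi in zip(n, p))        # sum r - sum n p
    g1 = (sum(ri * xi for ri, xi in zip(r, x))
          - sum(ni * pi * xi for ni, pi, xi in zip(n, p, x)))  # sum r x - sum n p x
    det = s_w * s_wxx - s_wx ** 2
    d_alpha = (g0 * s_wxx - s_wx * g1) / det
    d_beta = (s_w * g1 - s_wx * g0) / det
    return alpha + d_alpha, beta + d_beta

x, n, r = (-1, 0, 1), (10, 10, 10), (2, 8, 8)
start = (log(4) / 3, log(4))             # minimum logit chi-square trial value
one_step = correction_step(*start, x, n, r)

estimate = one_step
for _ in range(20):                       # further steps, to check convergence
    estimate = correction_step(*estimate, x, n, r)

print("after one step :", one_step, sum(fitted(*one_step, x)))
print("after 21 steps :", estimate, sum(fitted(*estimate, x)))
```

After a single step the fitted total $\sum \hat p$ is already close to the observed total 1.8, and further steps change the estimate only slightly, which is in keeping with the remark that no second iteration was needed.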


TABLE II

COMPARISON BETWEEN ESTIMATES BY MINIMUM LOGIT
CHI-SQUARE (~) AND BY MAXIMUM LIKELIHOOD (^)

                               1st Sample              2nd Sample              3rd Sample
$p$                            .2    .4    .8          (illegible)             .2    .8    .8
$\tilde p$                     (illegible) .457 .771   .259  .396  .553        .284  .613  .862
$\hat p$                       .172  .456  .772        .254  .393  .553        .283  .634  .883
$(\tilde\alpha, \tilde\beta)$  (-.173, 1.386)          (-.421, .632)           (.462, 1.386)
$(\hat\alpha, \hat\beta)$      (-.178, 1.396)          (-.434, .647)           (.549, 1.477)
L.D. 50: $\tilde m$            .125                    .666                    -.333
L.D. 50: $\hat m$              .127                    .671                    -.372
$(\sum p, \sum px)$            (1.400, 0.600)          (1.200, 0.300)          (1.800, 0.600)
$(\sum \hat p, \sum \hat p x)$ (1.400, 0.600)          (1.200, 0.299)          (1.800, 0.600)


TABLE III

MINIMUM LOGIT CHI-SQUARE AND MAXIMUM LIKELIHOOD
COMPARED FOR A FIVE-POINT FIT

$p_i$                          .2    .2    .8    .8    .8
$\tilde p_i$                   .231  .409  .613  .785  .893
$\hat p_i$                     .193  .367  .582  .770  .890
$(\tilde\alpha, \tilde\beta)$  (.462, .832)
$(\hat\alpha, \hat\beta)$      (.332, .878)
$(\sum p, \sum px)$            (2.800, 1.800)
$(\sum \hat p, \sum \hat p x)$ (2.802, 1.797)


10. SOME GENERAL CONSIDERATIONS


Berkson [3] undertook calculations to show that for certain true values of
$(\alpha, \beta)$ the minimum logit chi-square estimates had a smaller mean square error
than the maximum likelihood estimates, even when the samples were restricted
to those yielding finite estimates for $\hat\alpha$ and $\hat\beta$. However, for other true values,
such as those lying sufficiently far outside the range of estimates permitted by
the 2n-rule, it is doubtful whether such a claim could be upheld. In any case,
arguments based on the hypothesis “if the true value of the parameter is
so-and-so” are likely to be misleading. For instance, it is easy to show that in the
case of the ordinary binomial distribution with parameter $\theta$, if the number of
trials is, say, $n = 3$, then for all “true” values of $\theta$ between 1/4 and 3/4, the estimate
$T_1 = 1/2$ has smaller mean square error than the usual estimate $T_2 = r/n$. However,
this could hardly be adduced as a cogent reason for using $T_1$ in preference to $T_2$
when nothing is known concerning the “true” value of $\theta$.
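The binomial illustration can be checked by exact arithmetic. The sketch below is illustrative only; the names `mean_square_error`, `t1` and `t2` are hypothetical. It takes $T_1$ to be the constant estimate 1/2 and $T_2 = r/n$ with $n = 3$, and computes both mean square errors exactly by summing over the binomial distribution of $r$.

```python
from fractions import Fraction
from math import comb

def mean_square_error(estimator, theta, n=3):
    """Exact mean square error of estimator(r, n) when r is binomial(n, theta)."""
    return sum(
        comb(n, r) * theta ** r * (1 - theta) ** (n - r) * (estimator(r, n) - theta) ** 2
        for r in range(n + 1)
    )

def t1(r, n):
    """Constant estimate, ignoring the data."""
    return Fraction(1, 2)

def t2(r, n):
    """Usual estimate r/n."""
    return Fraction(r, n)

for theta in (Fraction(1, 10), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4), Fraction(9, 10)):
    print(theta, mean_square_error(t1, theta), mean_square_error(t2, theta))
```

The two mean square errors are $(\theta - \tfrac{1}{2})^2$ and $\theta(1 - \theta)/3$. They coincide at $\theta = 1/4$ and $\theta = 3/4$, the constant estimate is better strictly between those values, and $r/n$ is better outside them.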

When arguing from the standpoint of mean square error one should attempt
to show either that the advantage in terms of mean square error lies with the
minimum logit chi-square estimate for some plausible prior distribution of
$(\alpha, \beta)$ or that the advantage lies with that estimate uniformly, that is for all
values of $(\alpha, \beta)$. However, one would here be concerned with a decision problem
in which the loss function is of type $(T - \theta)^2$, in which case we have at our
disposal certain results (see, for example, Blackwell and Girshick [5]) which tell
us that the appropriate solution is a Bayes solution, and that a solution is not
admissible when sufficient statistics exist unless it is a function of those
sufficient statistics. Hence, the minimum logit chi-square estimate cannot
commend itself simply on the basis of its mean square error for particular true
values of the parameters.










On the other hand, if we have no knowledge of the prior distribution, or no
reason for postulating one, then the method of maximum likelihood still appears
to be the only one of sufficient generality to meet the inferential problem
of estimation, especially when sufficient statistics exist; and there would appear
to be no reason, either in theory or in practice, for abandoning it in estimating
the parameters of the logistic curve.


CORRIGENDA


Readers and authors are invited to submit corrections to papers published 
in any previous issue. These will be published each year in the December issue. 


Goodman, Leo A., and Kruskal, William H., MEASURES OF ASSOCIATION FOR
CROSS CLASSIFICATIONS, Vol. 49, No. 268 (December 1954), 732-64.

Vernon Davies (Washington State) has pointed out that formula (6) on p. 
740 is wrong. There should be a radical in the denominator, so that the correct 
formula is: 








$$T = \sqrt{[\chi^2/\nu] \big/ \sqrt{(\alpha - 1)(\beta - 1)}}.$$


Pennock, Jean L., and Jaeger, Carol M., ESTIMATING THE SERVICE LIFE OF
HOUSEHOLD GOODS BY ACTUARIAL METHODS, Vol. 52, No. 278 (June 1957),
175-85.

The authors call attention to a typographical error. The formula on p. 179 
should read: 


S241 
+ 5. 
L, 








BOOK REVIEWS 


Elementary Statistical Methods, As Applied to Business and Economic Data, Revised 
Edition, William Addison Neiswanger. New York: The Macmillan Company, 1956. Pp. 
xx, 749. $6.90. 


Lloyd Saville, Duke University


Well written and widely adopted, the Neiswanger text now appears in an im-
proved and modernized edition. This fine text has been generally known and 
highly regarded since its publication in 1943; the revision is both a comprehensive 
and meticulous job of bringing the volume up to date and perfecting the presentation. 

If descriptive statistics are far to the “right” and statistical inference is far to the 
“left”, then this text remains somewhat right of center. This may only indicate that 
Neiswanger’s choice of subject matter and approach in the beginning course in sta- 
tistics is similar to the reviewer’s. Practically, however, it is difficult to conceive of 
the nonspecialist, the undergraduate major in economics or business, coping with and 
digesting an approach much further to the “left.” In the revisions described below the 
increased emphasis on the problems of inference is noted. 

One of the most appealing characteristics of the earlier edition was its style. It is 
hard to imagine how pleasant and lucid prose could have been produced in the 
chaotic atmosphere of World War II Washington. It is remarkable evidence of
Neiswanger’s capacity for careful and effective work in a difficult environment: he was
Special Assistant to the Deputy Administrator of the Office of Price Administration 
while he was completing work on the 1943 edition of this volume. 

The new edition is a thoughtful and careful revision in which the style, content, 
and emphasis of the most readable text in the field have been improved and brought 
up to date. Compression and clarification of the earlier edition have permitted the 
addition of a number of new ideas and a completely new chapter on statistical inference, 
without unduly extending the length of the revised edition. Even in sections where 
words remain unchanged for several paragraphs a small alteration will appear to 
point up a phrase or an idea. Furthermore, topic sentences have been shortened and 
sharpened. This careful editing and rewriting of passages on practically every page 
as well as the recasting of many sections has resulted in an even more lucid and read- 
able book. 

Specifically the following changes have been incorporated in the new edition: (1) 
A thoughtful treatment of the “Design of Samples” represents a substantial reworking 
of the original Chapter 4. (2) In the earlier edition statistical inference was covered 
in the second half of Chapter 10, in nine pages. This section has been expanded more 
than three times and placed as a separate chapter, 11. In addition an entirely new 
chapter has been added, 12, also on inference. The first of these treats the mean and 
the second, proportions, differences, and small samples. There is now a total of fifty- 
six pages on the topic of statistical inference. (3) A chapter summary has been added 
to Chapter 1 so that now all of the chapters have this useful pedagogical device. (4) 
One of the most attractive features of the earlier edition was the abundant use of 
actual statistical examples for which accurate sources were given. In the revised edi- 
tion these cases have been brought up to date and many new ones added. The use of 
modern, interesting examples is, of course, valuable in giving the student an idea of 
the magnitudes of economic rubrics. This is still one of the strongest features of the 
text. (5) The question and problem sections as well as the bibliographies appearing 
at the end of each chapter have been revised.
(6) Numerous changes relate to the Appendixes. The general Table of Contents
no longer includes a list of the Appendixes and their page locations; this seems to be 
a minor but needless omission since two-thirds of the final page of the Table of Con- 
tents is blank. Two of the appendixes have been dropped and two added: “A Re- 
fresher on Arithmetic” containing a few drills on calculators, and “Sampling Variances 
for Three Sample Plans Mentioned in Chapter 4” have been added. Dropped are a 
listing of formulas and three proofs. Finally two tables have been added, one of them, 
the values of t, excerpted from R. A. Fisher, and the other, random numbers from 
the I.C.C. 

In both of the editions the allocation of space seems logical. The first two chapters 
in each discuss the use of statistical methods in general and the misuse of them in 
particular. Both are helpful in orienting the student, especially the early introduction 
of the recurrent theme, allowable error. Thorough treatment is properly given to time 
series, for the text is designed for the consumer of statistics in the fields of economics 
and business. Throughout, the emphasis on a verbal as well as a mathematical under- 
standing of statistical concepts helps to orient the nonmathematically trained student 
in the subject matter. 

This review is extremely favorable and in a sense the major criticism is also a com- 
pliment. In many ways the book is too good and too complete! It makes vigorous 
competition for the teacher; any teacher compares favorably with a poor text but 
only the best will outshine Neiswanger. Seriously, he says so much so well that the 
teacher may be led to follow the text instead of leading and dominating the course 
as he should. The matter of length is an obstacle, also. For a one semester course he 
may find it necessary to omit more than Chapter 12 and parts of Chapters 9 and 14 
as Neiswanger suggests. 

In brief it is a pleasure to report that this revision embodies many improvements 


over an originally very good text. 


Statistical Yearbook, 1956. Eighth Issue. Statistical Office of the United Nations. United 
Nations Publications, New York, 1956. Pp. 646. Paper, $6.00; cloth, $7.50. 


William Lerner, Bureau of the Census


This volume achieves a high degree of craftsmanship in statistical presentation. It
is marked throughout by distinctive care and forethought. Its compilers resolved
what must have been a great number of complexities. Yet, the resulting clarity and
ease of comprehension with which its tables are endowed make it all seem deceptively
simple.

Consider the statistical devil’s brew with which the Statistical Office of the U. N.
must contend in preparing this volume. One hundred fifty-one different countries or 
territories contributed the statistics making up the 187 tables in the 1956 Yearbook. 
The broad categories of subject matter covered in the tables include all the basic ones 
such as population, agriculture, manufacturing, mining, construction, transport, 
trade, national income, finance, and some others. If we take the number of contribut- 
ing countries (making some allowance for the great range of differences in their sta- 
tistical literacy and competence); add in a factor reflecting the quantity and variety 
of information obtained; then, add in some notion of the minor multitude of special 
circumstances due to time, place, and other elements affecting inter- and intra- 
country comparability; and finally, make the presentation bilingual (English and 
French); the character of the achievement is bound to be impressive. 

There is such a vast, well-organized meld of information here that it would require 
at least a committee of experts for a technical evaluation of content. Paradoxically
enough, it is difficult to escape the feeling that the handsome professional setting
(the well-designed tables and easy-to-read figures) which the Yearbook affords lends
many of the figures more substance than they deserve. Even a brief acquaintance 
with certain of the aforementioned brew ingredients would tend to foster such a 
suspicion. Nevertheless, the question is one of feasibility and, within the limits of 
what is possible, the compilers of the Yearbook have seemingly spared few of the cor- 
rectives useful for leading both the wary and the unwary. There are many explanatory 
general notes and a multiplicity of excellently brief footnotes. The tables also include 
important signals (horizontal and vertical bars) to represent series breaks, and other 
symbols to minimize minor ambiguities. 

A fine device included in the basic population table is a type-of-estimate code 
presented in a column alongside the current population estimates for each country. 
The code identifies the method used to derive each estimate and thereby provides a 
substantial clue concerning its dependability. The importance of the code becomes 
apparent when it is noted that the bases of the population estimates range from 
“continuous population registers” (present in the Scandinavian countries) to “con- 
jectural estimates derived by other means than counting” (used for Liberia and 
Cambodia). Here indeed is real help for those who are less than expert and more than 
casually interested. 

Inevitably, there are one or two minor items concerning presentation about which 
one may quibble. From a functional standpoint, the heavy vertical rule used in the 
tables to partition the stub from the field seems inappropriate because it too sharply 
arrests the eye travelling from the stub to the figure columns. This use of the heavy 
rule is also inconsistent with its more meaningful use in the Yearbook tables for setting 
off the “total” column, and denoting a distinctive gap in a time series. 

On occasion, also, the footnotes to a multipage table are relegated to the end of 
the table. This is contrary to the general practice in the book of putting the footnotes 
on the page where they apply, which is certainly preferable. It minimizes the possi- 
bility of referring to the wrong note since the notes are fewer for a single page. It also 
eliminates the possible confusion and forgetfulness resulting from turning several 
statistics-laden pages (usually, six or more) to find the end of the table. This variation 
in practice with respect to footnotes may be attributable to problems created by 
other elements, such as space. 

As is evident from these comments, the Yearbook’s flaws are noticeable only be- 
cause they are in contrast to the surpassing merit of the book as a whole. The scope 
of the data in the Yearbook is probably unequalled by any single volume in the inter- 
national field. Anyone who has reason to be informed readily on international matters 
will surely find it an indispensable reference tool. 

Curso de Estadística, Volumen III. P. Enrique Chacón, S. I. Volumen VI de Publicaciones
de la Universidad de Deusto, Bilbao, España: Editorial El Mensajero del Corazón de
Jesús. Pp. xv, 319. 200 pesetas.


Jorge Arias B., Universidad de San Carlos de Guatemala


This volume contains applications of the theory covered in the two previous volumes,
already reviewed in the December 1956 and September 1957 issues of the
Journal of the American Statistical Association. The applications refer especially to
sampling methods and statistical quality control.

Sampling techniques are described in the first 56 pages, in the following five 
chapters: XLIX on sampling generalities: advantage, errors, bias and different 
sampling methods; L on random, cluster and systematic sampling, presenting at the 
end a comparative study of the variances of the three methods; LI and LII on strati- 
fied and multistage sampling; and LIII on sampling errors and a general comparison 
between the methods explained. 

In general, this part of the book presents methods for applying the most important
theoretical aspects of sampling, always with reference to the two other volumes, but
the practical problems of applying such methods are touched on only lightly, which
diminishes the usefulness of the book. Though the author presents a problem for
each method, the reviewer feels the need for a greater number of illustrations, even
at the expense of the length of later chapters, such as those dedicated to activity
analysis.

The next six chapters (134 pages) treat industrial applications of statistics, with 
specific reference to quality control. Chapter LIV presents generalities on control 
charts, pointing out the differences between control and tolerance limits; LV and 
LVI present control charts for variables and attributes; the next three chapters cover 
control charts for number of defects, sampling for acceptance, and sampling by 
variables. 

The reviewer has the impression that the chapters in quality control possibly 
constitute the best presentation made on that topic in Spanish texts. The usefulness 
of such chapters could be increased by the inclusion of a greater number of examples. 

The last three chapters contain 62 pages, and include an introduction to activity 
analysis and a study of linear and nonlinear programming. These chapters appear to 
the reviewer to be outside the scope of the book. Chapter LXII treats the basis of 
linear programming, its relation to statistical games, application of Motzkin’s 
method for solving problems, the Hitchcock and the warehouse problems, and the 
dual, the simplex, and the graphical methods. The chapter on nonlinear programming 
does not study general problems. It refers to concrete cases of production problems 
under imperfect competition (monopoly and monopsony in joint production). The 
reviewer would prefer that the space occupied by these chapters be used, as was 
pointed out before, to extend the treatment of sampling methods, or for the inclusion 
of sequential sampling or of other applications of statistics to industry, such as a 
chapter on design of experiments. 

The book ends with a 46 page section of tables, including 12,000 random numbers, 
and three indexes: one of topics, one of symbols, and one of authors. 

Special reference should be made to the complete and up-to-date bibliography that 
increases the value of the text. 

After completing a review of the three volumes included in this statistics course, 
the reviewer still believes that this is one of the best works that has been written in 
Spanish on statistics. 


An Introduction to the Analysis of Time Series. Peter O. Steiner. New York: Rinehart & 
Company, Inc., 1956. Pp. ii, 94. $2.00. Paper. 


John I. Griffin, College of the City of New York


This small pamphlet, described as a “Preliminary Edition,” consists of three
chapters concerned with the usual topics in time series analysis found in most 
textbooks on economic statistics. The chapters are titled “The underlying philosophy 
of time series analysis,” “Seasonal variation” and “Secular trend and decomposition.” 

The distinctive approach is found in the attempt to integrate time series analysis 
into the main line of statistical analysis. “The conviction that underlies these chapters 
is that the methods of analysis of time series are to be understood in the same general 
terms that apply to all statistical problems: the drawing of inferences about under- 
lying relationships on the basis of observed data.” A discussion in the three chapters, 
therefore, involves the establishment of an assumed model of the way in which the 
relationships generate observable data. The model is first specified and an artificial 
series is constructed according to the model; the series is then decomposed by methods
whose logic depends upon the model. 

The hypothetical time series, which is constructed according to the model
R = S × T × CI, consists of ten years of quarterly data built up from a seasonal index,
a cyclical-irregular index, and the values of a secular trend. This hypothetical series 
is then decomposed into the usual components. Actual data are limited to a series on 
Public Debt and a chart of the Index of Industrial Production. 

The specific techniques discussed in the analysis of seasonal variation include 
graphic examination of the series, use of moving averages, and construction of a 
seasonal index from the ratios of original items to the moving averages. The quarterly
data are used in the illustration of the method with only a brief reference to the 
application in the case of monthly observations. In practice a student is more likely 
to be using monthly rather than quarterly data. 
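
The ratio-to-moving-average procedure described above can be sketched in a few lines. This is only an illustration, not the pamphlet's own computation: the seasonal factors, the linear trend, and the function names are hypothetical, and the cyclical-irregular index is set to 1 throughout so that the recovered index can be compared with the one built in.

```python
def centered_moving_average(series):
    """Centered four-quarter moving average; None where it is undefined."""
    n = len(series)
    out = [None] * n
    for i in range(2, n - 2):
        out[i] = (0.5 * series[i - 2] + series[i - 1] + series[i]
                  + series[i + 1] + 0.5 * series[i + 2]) / 4.0
    return out


def seasonal_index(series):
    """Average the ratios of original items to the moving average by quarter."""
    averages = centered_moving_average(series)
    ratios = [[] for _ in range(4)]
    for i, (item, avg) in enumerate(zip(series, averages)):
        if avg is not None:
            ratios[i % 4].append(item / avg)
    means = [sum(r) / len(r) for r in ratios]
    scale = 4.0 / sum(means)  # adjust so the four indexes average 1.0
    return [m * scale for m in means]


# Hypothetical ten-year quarterly series built as R = S x T x CI (CI = 1 here).
TRUE_SEASONAL = [0.9, 1.1, 1.2, 0.8]
series = [TRUE_SEASONAL[i % 4] * (100.0 + 2.0 * i) for i in range(40)]
estimated = seasonal_index(series)
print([round(s, 3) for s in estimated])
```

For monthly observations the same steps would apply with a centered twelve-month average and twelve sets of ratios.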

Trend analysis, as discussed in the last chapter of this pamphlet, includes freehand 
smoothing, moving averages, and the method of least squares limited to the linear 
case. Reference is made to methods of successive approximation based upon examin- 
ing the deviations of the original items from the fitted line in order to determine if a 
trend is observable in the deviations. The author indicates that the next higher degree 
polynomial is fitted and the deviations are again examined until a polynomial is 
found for which the deviations exhibit no trend. This polynomial “is taken as the 
best approximation to the trend indication.” Acknowledgement is made in a footnote 
of the possibility of using other procedures for deciding when to stop raising the 
degree of the polynomial. Unfortunately no indication is given of the character of 
these procedures. Any discussion of the method of orthogonal polynomials is omitted 
on the ground that it is a device of computational significance only. It seems to this 
reviewer, however, that if the student’s attention is called to the problem of select- 
ing the appropriate degree of polynomial, then, in all fairness, warning should be 
included as to the validity of the procedures. In a pamphlet which stresses the purpose 
of integrating time series analysis and statistical inference this particular problem 
would provide an excellent testing ground. It has been my experience that students 
are quite capable of using orthogonal methods and determining the residual variance 
and applying tests. 
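
The kind of test the reviewer has in mind can be illustrated as follows. This is a sketch under stated assumptions rather than a procedure taken from the pamphlet: the orthogonal polynomials are built by Gram-Schmidt on the time points, the function name and the sample data are hypothetical, and the F ratio for each added degree would still have to be compared with tabled values.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sequential_polynomial_fit(t, y, max_degree):
    """Fit degrees 0..max_degree one at a time with orthogonal polynomials.

    Returns a list of (degree, ss_term, residual_ss, f_ratio) tuples, where
    ss_term is the sum of squares removed by adding that degree.
    """
    n = len(t)
    basis = []                  # orthogonal polynomial values at each time point
    residual_ss = dot(y, y)     # nothing fitted yet
    results = []
    for degree in range(max_degree + 1):
        q = [ti ** degree for ti in t]
        for b in basis:         # remove components along lower degrees
            coef = dot(q, b) / dot(b, b)
            q = [qi - coef * bi for qi, bi in zip(q, b)]
        basis.append(q)
        ss_term = dot(y, q) ** 2 / dot(q, q)
        residual_ss -= ss_term
        df = n - degree - 1     # residual degrees of freedom
        if df > 0 and residual_ss > 1e-9:
            f_ratio = ss_term / (residual_ss / df)
        else:
            f_ratio = float("inf")
        results.append((degree, ss_term, residual_ss, f_ratio))
    return results


# Hypothetical data: a quadratic trend plus small fixed disturbances.
times = list(range(-5, 6))
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.2, -0.1]
values = [2 + 3 * x + 0.5 * x * x + e for x, e in zip(times, noise)]
for degree, ss_term, residual_ss, f_ratio in sequential_polynomial_fit(times, values, 3):
    print(degree, round(ss_term, 3), round(residual_ss, 3), round(f_ratio, 2))
```

One would stop raising the degree at the first term whose F ratio is not significant, which gives the student an explicit rule in place of inspecting the deviations by eye.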

The particular function of this pamphlet puzzles the reviewer. If it is intended as 
an elementary course discussion, then the lack of real data in the examples is a 
serious omission. If the needs of more advanced students are to be served, the level 
of sophistication will have to be raised. 


Statistical Methods in Quality Control. Dudley J. Cowden. New York: Prentice-Hall, Inc., 
1957. Pp. xxiv, 727. Trade price $11. Text price $8.25. 


Acheson J. Duncan, The Johns Hopkins University


This book brings together under a single cover descriptions of a great number and
variety of statistical techniques useful in the control of quality of manufactured
products. These descriptions include explanations of the theory underlying the tech- 
niques as well as the details of application. It is not a book that the casual student 
would read for general knowledge of quality control because of the great attention 
given to detail. A reader would do well to have some prior knowledge of elementary 
statistics and introductory quality control and an acquaintance with the calculus. 
The first ten chapters of the book discuss general statistical methods with illustra- 
tions in statistical quality control. These include concise discussions of probability 
and distribution theory, the theory of estimation and testing hypotheses, and ele- 
mentary analysis of variance. The next two chapters discuss statistical techniques 
especially useful in testing amounts of variation and patterns of variation. Included
are lambda and F tests for homogeneity of means and variances, the chi-square test 
of normality, control charts, ratios involving extreme values, the studentized range, 
autocorrelation, and run theory. Chapters 13 to 29 are devoted to a detailed discus- 
sion of statistical techniques of process control and chapters 30 to 40 are devoted to 
“product control” or what is more commonly called sampling inspection. Inter- 
spersed among these later chapters are chapters on the Poisson distribution, hyper- 
geometric distribution, and elementary correlation, and a section on covariance. 

Throughout the book emphasis is upon the use of the “best” statistical procedure. 
For example, in testing whether past data show lack of control, lambda tests of 
homogeneity of variances and means are recommended, with the ordinary control 
charts being treated as helpful supplementary devices. This is laudable from a statistical
viewpoint, but the quality control engineer will probably still give top rating
to his control charts since they usually give him more insight into the process than 
numerical tests of hypotheses. Nevertheless, this book may have a beneficial influence 
in improving statistical practice in quality control. 

The discussion of process control is very comprehensive. All kinds and varieties of 
charts are discussed as well as special statistical devices for testing for existence of 
control and determining statistical tolerances. Special attention is paid to the control 
chart for items as a recommended supplement to the usual X̄ and R charts. Also
discussed are control charts for standard deviations and variances, mid-ranges,
extremes, and charts for nonnormal universes. The only control chart known to the 
reviewer that has not been included is that for individuals which makes use of the 
moving range (see American Society for Testing Materials, Manual on Quality 
Control of Materials, p. 105). Much space is devoted to the economics of process 
control where some interesting original material is presented. 

The discussion of techniques of process control is generally excellent and the fol- 
lowing isolated comments should in no way be taken as derogatory criticism of the 
book as a whole: 

(1) Although the author carefully distinguishes between testing past data for 
existence of control and determining whether the process remains currently in control, 
his chapter on operating characteristic curves for control charts does not state clearly
that the discussion pertains solely to current control of the process. The OC curves 
discussed there give the chance of a single point falling outside the control limits. The 
operating characteristics of a chart used to assess the past existence of control are 
not treated in the text. (See, e.g., E. P. King, Annals of Mathematical Statistics, 
Vol. 23, pp. 384-95.) This may be because the author prefers lambda tests of homo- 
geneity for this purpose, although he does not give their operating characteristics 
either. 

(2) The initial p-chart in the book has a special form differing from the ordinary 
p-chart in that the points are plotted in tiers by groups. This may be most effective 
for the problem in hand, but is not a good example of the usually rather simple p-chart. 

(3) The assumption of normality in the discussion of statistical tolerances is not as 
emphatically stated as it should be. Departures from normality are most likely to 
occur far out on the tails and these are just the points with which statistical tolerances 
are concerned. The author could in this connection have included Wilks’ distribution- 
free procedure and noted the great price we have to pay in increased sample size when 
we cannot assume a specific distribution form. 

(4) The discussion of runs (p. 231) is somewhat misleading. To say that the probability
of a run of 7 above or below the central line is 2(1/2)⁷ [= 0.016] assumes that
the run starts at time 0. The probability of a run of 7 anywhere in a larger set of data
will depend on the total number of points involved. For example, the probability of 
a run of 7 in a set of 20 points is about 0.05. Even for a chart used to maintain cur- 
rent control, it would seem to the reviewer that as runs occur they should be ap- 
praised as members of the whole set of data just past rather than as an isolated set 
of points. 
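
The dependence on the total number of points can be checked by exact counting. The sketch below is an illustration only; it assumes independent points that fall above or below the central line with probability 1/2 each, and the function names are hypothetical. Because the figure depends on whether a run on either side or on one specified side is counted, both versions are given.

```python
def prob_run_either_side(n, k):
    """P(some run of at least k points on the same side) in n points, k >= 2."""
    counts = [0] * k      # counts[r]: sequences ending in a run of length r, none >= k
    counts[1] = 2         # first point: above or below
    for _ in range(n - 1):
        new = [0] * k
        new[1] = sum(counts)            # next point switches sides
        for r in range(1, k - 1):
            new[r + 1] = counts[r]      # next point extends the current run
        counts = new
    return 1 - sum(counts) / 2 ** n


def prob_run_one_side(n, k):
    """P(some run of at least k points above the line) in n points."""
    counts = [0] * k      # counts[r]: sequences ending in r points above, none >= k
    counts[0] = 1
    for _ in range(n):
        new = [0] * k
        new[0] = sum(counts)            # point below: the run above is broken
        for r in range(k - 1):
            new[r + 1] = counts[r]      # point above: the run grows
        counts = new
    return 1 - sum(counts) / 2 ** n


for n_points in (7, 20, 50):
    print(n_points,
          round(prob_run_either_side(n_points, 7), 4),
          round(prob_run_one_side(n_points, 7), 4))
```

With exactly 7 points the either-side figure reduces to 2(1/2)⁷, the value quoted from the book; for longer sets of points both probabilities rise steadily.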

(5) Lastly, the conclusions drawn about the economic design of a control chart do 
not agree entirely with the conclusions reached by the reviewer in a similar study 
based on more extensive premises. (See this Journal, June 1956, pp. 228-42.) There 
is not space here to detail the differences, except to say that the reviewer did not 
find the frequency of sampling as sensitive to variations in conditions as Cowden did 
for his model. 

The chapters on lot sampling inspection plans cover the usual topics. The author 
introduces the subject with a chapter on variables sampling when the standard 
deviation is known and then proceeds to discuss single, double (for both the case of 
equality of the two rejection numbers and the case of inequality), multiple, and item- 
by-item sequential attribute sampling. Details are given for computations of OC 
curves for multiple as well as other attribute sampling plans, but the calculation of 
ASN curves for curtailed inspection is not included. There is also a description of the 
Dodge CSP-1 continuous sampling plan and of the Wald-Wolfowitz plans SPA and 
SPC. For reasons not very convincing to the reviewer the first two of these plans are 
discussed in the chapter on single sampling and the third in the chapter on double 
sampling. There is no discussion of the multi-level continuous sampling plans of 
Lieberman and Solomon, nor mention of the Navy Standard NAVORD OSTD 81 or 
the Army Standard ORD-M608-11. The section on attribute plans ends with a 
chapter on Military Standard 105A. A chapter follows on variables sampling with σ
unknown and the book concludes with a chapter on Simon’s Grand Lot Scheme and 
Shainin’s Lot Plot. 

Although the discussion of acceptance sampling is of a high order, it is a little dis- 
appointing in at least two instances. The reviewer had semantic difficulties with 
some of the expressions employed. For example, the author introduces the term 
“product control” in contra-distinction to “process control,” the former to be ob- 
tained primarily by sampling inspection, the latter by control charts or other statisti- 
cal devices. To quote (p. 472): “The purpose of process control is to maintain a con- 
stant cause system turning out a satisfactory product. ... The purpose of product 
control is to see that the product is satisfactory regardless of whether the cause 
system is constant.” To the reviewer, the word “control” has become closely associ- 
ated with maintenance of a constant cause system, satisfactory or otherwise, and he 
dislikes the use of this word when applied to sampling inspection where the objective 
is primarily quality assurance. This is no place for an extended semantic argument 
and the reviewer simply wishes to object to the introduction of new terms and modification
of old ones, where commonly used expressions will do equally well.

Secondly, the author throws out the abbreviation AQL, replacing it by P₁, since it
is easily confused with AOQL. (He cannot, of course, do this in the chapter on Military
Standard 105A.) The whole concept of the Acceptable Quality Level is of such
vital importance in many kinds of acceptance sampling plans (it may be the key
figure in a contract) that to replace AQL by the meaningless expression P₁ is almost as
bad as throwing the baby out with the bath. The reason for the author’s decision in
this matter stems, it would seem, from his frequent reference to the plans originally 
proposed by the Statistical Research Group, Columbia University, and published by 
the McGraw-Hill Book Company as Sampling Inspection. In these plans the AQL
is merely the abscissa of the 0.95 point on the OC curve. This approach to the AQL
has been out-dated since the appearance of Military Standard 105A.

The chapters on variables sampling are relatively meager in view of the growing 
interest in this subject. They do include the novel scheme of controlling the per cent 
defective through running acceptance tests on the sample standard deviation, the 
assumption being that means are stable from lot to lot, and variation in quality is
due to variation in the lot standard deviation. This might be an appropriate scheme 
when machines are easily set and maintained, but have widely different variabilities. 
The rest of the discussion is devoted to the well known X̄ + kσ or X̄ + ks type of plan.
Very little is said about the new minimum variance unbiased estimator offered by
Lieberman and Resnikoff (this Journal, June, 1955, pp. 457-516) or about the very 
important use of the range in variables sampling. Also no mention is made of the 
Navy’s NAVORD OSTD 80 nor the Army’s ORD-M608-10.

The book ends with 66 pages of tables and a summary of important symbols. 
There is no bibliography of references either at the ends of chapters or in the ap- 
pendix. This is unfortunate since the reader might want to follow many of the topics 
further. Also there are no problems for students. 

The reviewer found very few mistakes in the text. The most glaring is the statement
on p. 158 that s is an unbiased estimator of σ. Actually, s² is an unbiased
estimator of σ². In general, however, the author is very careful to avoid this common
slip. On page 57 it would have been better to have said that normal curve areas are
approximated by integrating expression (4).

The most common typographical error is the omission or incorrect insertion of 
exponents. For example see formula 15.9, p. 220, and again on p. 223; also the oppo- 
site mistake in formula 6.36, p. 70. 

The mathematical derivations are in a few cases excessively long and awkward. 
For example, the derivation on p. 295 can be replaced by 1 + 2θ + 3θ² + ···
= d/dθ [1/(1 − θ)] = 1/(1 − θ)². Again the derivation of the Poisson distribution (p. 410)
can be shortened by simply cancelling out (n − d)!, substituting a = nP, and letting
n → ∞.
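
Written out, the two shortcuts suggested here might run as follows. This is a sketch only: the symbol θ for the series variable and the binomial starting point for the Poisson limit are assumptions, since the review does not reproduce the book's own notation.

```latex
% Series: differentiate the geometric series term by term (|\theta| < 1).
\[
1 + 2\theta + 3\theta^{2} + \cdots
  = \frac{d}{d\theta}\left(1 + \theta + \theta^{2} + \theta^{3} + \cdots\right)
  = \frac{d}{d\theta}\,\frac{1}{1-\theta}
  = \frac{1}{(1-\theta)^{2}}
\]

% Poisson limit: cancel (n-d)!, put a = nP, and let n tend to infinity.
\[
\frac{n!}{d!\,(n-d)!}\,P^{d}(1-P)^{n-d}
  = \frac{n(n-1)\cdots(n-d+1)}{n^{d}}\;\frac{a^{d}}{d!}\;
    \left(1-\frac{a}{n}\right)^{n-d}
  \;\longrightarrow\; \frac{a^{d}e^{-a}}{d!}
\]
```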

On the whole this is a very fine book. The author has taken great pains in presenting 
concise explanations of a large variety of statistical techniques. The quality control 
engineer should find it exceptionally useful in presenting him with new and statisti- 
cally sound procedures for attaining process control. The teacher may find it a suit- 
able text for advanced courses, and certainly an excellent source of reference for 
statistical theory as applied to quality control. 


Quality Control & Applied Statistics. An abstract service. Robert S. Titchen, Arnold J. 
Rosenthal, Bruce Bollerman, and Frank Nistico, Editors. Milton Terry and Grant Wernimont,
Advisory Board. New York: Interscience Publishers, Inc., 1956. Subscription rate
$60.00 per year, binders $5.00 each. 


Ellis R. Ott, Rutgers University


The first issue of this abstracting service was published in June, 1956. The first
volume consists of 12 issues which contain approximately 1,000 to 1,200 pages. 
Each volume will “bring together in compact, loose-leaf form all the important new 
results from the world literature in the field.” 
“The technical worker in quality control or in applied statistics must also keep 
abreast of neighboring fields, such as operations research, industrial engineering, 
management of quality control, quality control instrumentation.” In order to accomplish
this stated objective, the editors propose to scan some 400 journals and
abstract those selected articles in such a way that it will usually “not be necessary to 
consult the original published paper.” 

A listing of the journals reviewed in Volume I, Issue No. 1, includes periodicals 
from many countries, including the United States, Canada, England, India, Japan, 
France, Belgium, Russia, Germany, Spain, the Netherlands, and others. 

The loose-leaf format and the assignment of a double classification to each abstract 
(by methodology and by field of application) permit filing the abstracts in several 
ways. The editors indicate that an annual index will furnish a complete guide to the 
abstracts published each year. The main headings of the principal classifications are 
as follows (each of these principal classifications has many sub-classifications): 

100, Statistical Process Control; 200, Sampling Principles in Plans; 300, Manage- 
ment of Quality Control; 400, Mathematical Statistics and Probability Theory; 500, 
Experimentation and Correlation; 600, Managerial Application; 700, Measurement 
and Control. 

Another classification has the following headings: A, Process Control in Manufac- 
turing & Production (with 14 categories listed); B, Marketing and Promotion; C, 
Banking, Finance, and Insurance; D, Research and Development (again with the 
same 14 categories including mechanical, electrical and electronic, chemical and 
pharmaceutical, and many others); E, Military, Naval and Government operations; 
F, Social Sciences, including Psychology & Education; G, Office Management;
H, Agriculture; I, Medicine and related fields; J, Public Utilities and Transportation; 
K, Earth Sciences; L, General Administration; Y, General Application; Z, Other. 

A page of directions on available methods of filing is also included. 

Each article which is abstracted includes, besides the title and author, such bits of 
information as: the purpose, a summary, results, references, and abstractor. A very 
satisfactory photo-offset printing process is used. 

This is a very ambitious project and should be a valuable source of information 
on what is being published in quality control and applied statistics. 


Sampling Theory of Surveys with Applications. Pandurang V. Sukhatme. New Delhi: The
Indian Society of Agricultural Statistics; Ames, Iowa: The Iowa State College Press, 1954.
Pp. xix, 491. $6.00. 


Ingram Olkin, Michigan State University


This book is an outgrowth of lectures which the author has delivered since 1945
at various institutions in India and at Iowa State College. The author states 
that the book is primarily designed to serve the needs of a text for teaching an ad- 
vanced course in sampling theory of surveys and the needs of a reference book. 
Chapter I is an introduction to the basic ideas in sampling. Chapter II deals with 
the theory of simple random sampling and sampling with varying probabilities of 
selection. The partitional notation is developed and the theorem that the expectation 
of a monomial symmetric function of the sample observations is the corresponding 
function for the population with a certain multiplicative constant is proved. This 
permits the evaluation of V(s²) as well as approximations for the mean and variance
of a ratio estimate. The hypergeometric distribution is also discussed in some detail. 
Chapter III covers stratified sampling and is divided into two parts. The standard 
material when the selection is made with equal probabilities is covered in the first 
part. Sections on the use of stratum sizes for improving the precision of an unstratified 
sample, the effect of increasing the number of strata on the precision of the estimate, 
the effects of inaccuracies in strata sizes are also included. The second portion com- 
prises the theory when the selection is made with varying probabilities. 
Ratio and regression estimates are the subjects of Chapters IV and V, and include
approximations to E(ȳ/x̄) and V(ȳ/x̄) to O(n), a treatment of ratio estimation for
qualitative characters divided into K classes, a discussion of weighted regression
estimates with a comparison between weighted and simple regression, double sampling
and successive sampling.

Chapter VI is entitled “Choice of Sampling Unit” and deals with cluster sampling 
in the case of equal and unequal clusters. Sub-sampling with equal and unequal selection
probabilities is the subject of Chapters VII and VIII. Two- and three-stage
sampling with equal first-stage units, two-stage sampling with unequal first-stage 
units, and stratified sub-sampling are included in Chapter VII. Determination of 
optimum probabilities, a discussion of ratio estimates, sub-sampling without replace- 
ment, collapsed strata, and sampling without replacement at each stage form a part 
of Chapter VIII. 

Chapter IX deals with systematic sampling including a discussion of two-stage
sampling with equal and unequal units and systematic sampling of second-stage 
units. The final chapter covers non-sampling errors, arising from observational errors 
and incomplete sampling or non-response. 

Certain sections have been marked with asterisks to indicate that they may be 
omitted upon a first reading. It is just these sections (about 10% of the book) which 
distinguish the book as an advanced text. 

Each chapter is treated anew in that the algebraic development begins from first 
principles. This may be advantageous when using the book as a reference, but it is a 
bit tedious in a text for a course. For example, the problem of optimum allocation 
occurs in many phases of the theory and could be handled in one stroke. 

There are numerous examples which are helpful, but no exercises. From the point 
of view of a reference or advanced text, the bibliography is somewhat scant. 

Since this book follows on the heels of several recent books on the subject, some 
comparisons seem inevitable. The organization of the contents follows most closely 
that of W. G. Cochran’s Sampling Techniques. This reviewer found Sukhatme to be 
preferable if used for an advanced course in which there is emphasis on a complete 
algebraic treatment and for which the students are moderately well prepared. How- 
ever, it is not as lucid in many of the descriptive sections. 

In conclusion, despite a number of drawbacks, the inclusion of many advanced 
topics and the very extensive treatment given in certain sections make this book a 
welcome addition. This reviewer feels that it may also be used to great advantage as 
reference material and as supplementary reading in courses which are at a modest 
level. 


Tables of the Non-Central t-Distribution. George J. Resnikoff and Gerald J. Lieberman. 
Stanford, California: Stanford University Press, 1957. Pp. 389. $12.50. 


B. L. Welch, University of Leeds


The non-central t statistic is defined by the ratio of (z+δ) to √w, where z is a unit
normal deviate, w is distributed as χ²/f with f degrees of freedom, and δ is the
non-centrality parameter. The distribution of t is required in the solution of a number
of problems arising when a normal distribution with mean μ and standard deviation σ
is being sampled, for instance in drawing inferences about certain simple functions
g(μ, σ) of the population parameters. Some of these problems are described in the
introduction of the present book of tables. It is the nature of most of the procedures 
depending on the use of the non-central t statistic that the normality of the parent
population is a serious assumption. Nevertheless it is important that simpler methods 
should precede more complex ones and that a thorough tabulation of normal theory 
distributions should be made. 
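
The definition given at the start of this review lends itself to a direct simulation. What follows is a minimal sketch in Python and is not drawn from the book under review; the function names, the seed, and the illustrative choice f = 10, δ = 1 are all assumptions. It draws t = (z+δ)/√w from the definition and sets the sample mean beside the known closed-form mean δ√(f/2)Γ((f−1)/2)/Γ(f/2), which holds for f > 1.

```python
import math
import random


def noncentral_t_draw(rng, f, delta):
    """One draw of t = (z + delta) / sqrt(w), following the definition."""
    z = rng.gauss(0.0, 1.0)                # unit normal deviate
    chi2 = rng.gammavariate(0.5 * f, 2.0)  # chi-square with f degrees of freedom
    w = chi2 / f
    return (z + delta) / math.sqrt(w)


def noncentral_t_mean(f, delta):
    """Closed-form mean of the non-central t, valid for f > 1."""
    log_ratio = math.lgamma(0.5 * (f - 1)) - math.lgamma(0.5 * f)
    return delta * math.sqrt(0.5 * f) * math.exp(log_ratio)


def simulated_mean(f, delta, n=100_000, seed=1957):
    """Average of n simulated draws (seed fixed so the run is repeatable)."""
    rng = random.Random(seed)
    return sum(noncentral_t_draw(rng, f, delta) for _ in range(n)) / n


if __name__ == "__main__":
    print("simulated:", simulated_mean(10, 1.0))
    print("exact:    ", noncentral_t_mean(10, 1.0))
```

The mean exceeds δ for every finite f, which is one simple way to see the skewness of the distribution.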

The distribution of t depends on f and δ and any full table of the probability density
or of the probability integral must therefore be one of triple entry. Resnikoff and
Lieberman give such tables covering 28 different values of f and, for each f, covering
explicitly 10 different values of δ (although not the same set of δ’s at each given f).
For each one of these 280 combinations of f and δ a detailed tabulation of the density
function and the cumulative probability function is given to four decimal places
against values of the argument x = t/√f at intervals of 0.05. Quite apart from the direct
applications for which the tables are designed, they are of further theoretical interest 
in that they provide information relating to a wide variety of frequency curves of
different degrees of skewness, and long-tailedness. Short tables of percentage points 
are also provided at the end of the volume. 

The reason for the choice of values of δ which are not the same for all f lies in the
nature of the more important applications, the interesting values of δ tending to increase
with √(f+1). Some users might have preferred values of δ other than the
specific ones the authors have provided, but clearly the choice of as few as 10 values 
is a very difficult one to make, and the choice of more than 10 values would have led 
to a volume impracticably large. 

The tables were computed de novo on the IBM Card Programmed Computer, 
Model II, by generating the probability density function on the machine directly and 
then evaluating the cumulative function by numerical integration. The authors report 
good agreement in spot-checks with previous tables which were calculated by entirely 
different methods and presented in a very different manner. The present reviewer 
finds the tables easy to read although he would have welcomed larger print for the 
introductory matter and some further discussion of methods of interpolation in the 
δ-direction.
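
For readers who wish to check an entry for themselves, the cumulative function can be had by a short numerical integration. The sketch below is not the authors' procedure, which generated the density of t and integrated that; as a stated substitute it applies Simpson's rule over s = √w to the integrand Φ(ts − δ)g(s), where g is the density of s and Φ the unit normal integral. The function names, the cut-off `s_max`, and the number of intervals are assumptions made for illustration.

```python
import math


def normal_cdf(x):
    """Unit normal probability integral."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def scale_density(s, f):
    """Density of s = sqrt(w), where w is chi-square(f) divided by f."""
    if s == 0.0:
        return math.sqrt(2.0 / math.pi) if f == 1 else 0.0
    log_g = (math.log(2.0) + 0.5 * f * math.log(0.5 * f)
             + (f - 1) * math.log(s) - 0.5 * f * s * s
             - math.lgamma(0.5 * f))
    return math.exp(log_g)


def noncentral_t_cdf(t, f, delta, n=2000):
    """P(T <= t) by Simpson's rule on [0, s_max]; n must be even."""
    s_max = 1.0 + 10.0 / math.sqrt(f)  # assumed cut-off; the tail beyond is negligible
    h = s_max / n
    total = 0.0
    for i in range(n + 1):
        s = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * normal_cdf(t * s - delta) * scale_density(s, f)
    return total * h / 3.0


if __name__ == "__main__":
    # Argument of the tables is x = t / sqrt(f); convert back to t before calling.
    f, delta, x = 9, 1.0, 0.50
    print(noncentral_t_cdf(x * math.sqrt(f), f, delta))
```

With δ = 0 the routine reduces to the ordinary t integral, which gives a convenient check against standard tables.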

Resnikoff and Lieberman and the Stanford University Press are to be congratulated
on the organization and successful conclusion of this enterprise.


Probability: An Intermediate Text-book. M. T. L. Bizley. New York: Cambridge Univer- 
sity Press, 1957. Pp. vii, 230. $4.00. 


K. L. Chung, Syracuse University


This is a pleasant little book in format and content. Six of the seven chapters, about
80 per cent in number of pages, deal with so-called “permutations and combinations.”
One chapter treats Waring’s theorem (1792), a particular but essential case of
which is frequently referred to in other books as Poincaré’s formula. This useful 
method is well illustrated by several worked examples. Another chapter illustrates 
the classical method of difference equations, but generating functions are not intro- 
duced. There is a chapter on runs which goes somewhat beyond the most elementary 
problems. The chapter on continuous variables is concerned mostly with simple 
geometrical situations as points on a line or circle. In the last two sections limiting 
probabilities of the simplest kind are illustrated including a derivation of the ex- 
ponential waiting time distribution. 

The fundamentals of probability theory are handled in a heuristic but enlightened 
way based on intuitive set theory. The working definitions and secondary proofs are 
well done. No deeper concepts are needed since such usual topics as the law of large 
numbers and the central limit theorem, even in the Bernoulli-DeMoivre form, are not
discussed. Indeed neither the normal nor the Poisson distribution is mentioned. 

The best part of the book is the large number of good examples, both worked and 
otherwise; often these are exhibited in a number of “variations.” The careful discussion
of certain common pitfalls (see e.g., Example 1.6 and Section 2.4) will also be
helpful to the beginner. 








590 _ AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1957 


The book is intended for actuarial students. One wonders, on comparing it with 
Cramér’s The Elements of Probability Theory and Some of Its Applications (Wiley 
& Sons, 1956), written originally for the same clientele, what these students really 
want of probability theory. Cramér’s book contains hardly any permutation-combi- 
nation but is “all about” the normal distribution. The general student of statistics 
who has difficulties with permutations and combinations may find this book a useful 
supplement to more orthodox texts. 


Computing with Desk Calculators. Walter W. Varner (Head, Flight Simulation Laboratory
and Computers, Convair Astronautics). New York: Rinehart and Company, Inc., 1957.
Pp. 108. $2.00. Paper.


Ruth Zucker, National Bureau of Standards


This is a general instruction manual for the beginner who will work in a computing
laboratory, but it should also prove useful to the experienced computer as a reference.

The manual lends itself to independent study and classroom work. There is a separate
section of examples on perforated sheets which can be removed easily. Answers
to odd numbered problems are given, while answers to even numbered problems are
listed in a separate pamphlet. 

The first few chapters on arithmetic operations are similar to those found in 
machine instruction booklets. The later chapters are devoted to the calculation of 
roots, manipulations with approximate numbers, interpolation, and statistical com- 
putations. 

In the chapter on statistical applications, examples are given of the proper use of 
desk calculators in computing the arithmetic mean, the standard deviation, and the 
Pearson coefficient of correlation. Since square roots occur frequently in these 
formulas, several methods are given to obtain them either directly, or indirectly by 
iteration. 
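
The computations named in this chapter can be set out briefly in modern terms. The sketch below is an illustration only and does not reproduce the manual's worked examples: it accumulates the sums of squares and products a desk calculator would hold in its registers, and extracts square roots by the familiar iteration x ← (x + a/x)/2. The function names, the stopping rule, and the divisor n − 1 in the standard deviation are assumptions.

```python
def newton_sqrt(a, rel_tol=1e-14, max_iter=100):
    """Square root of a >= 0 by the iteration x <- (x + a/x) / 2."""
    if a < 0:
        raise ValueError("square root of a negative number")
    if a == 0:
        return 0.0
    x = a if a >= 1.0 else 1.0  # any positive starting value will do
    for _ in range(max_iter):
        nxt = 0.5 * (x + a / x)
        if abs(nxt - x) <= rel_tol * nxt:
            return nxt
        x = nxt
    return x


def summary_statistics(xs, ys):
    """Means, standard deviations (divisor n - 1) and Pearson r from raw sums."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    ssx = n * sxx - sx * sx   # n times the corrected sum of squares of x
    ssy = n * syy - sy * sy   # the same for y
    return {
        "mean_x": sx / n,
        "mean_y": sy / n,
        "sd_x": newton_sqrt(ssx / (n * (n - 1))),
        "sd_y": newton_sqrt(ssy / (n * (n - 1))),
        "r": (n * sxy - sx * sy) / newton_sqrt(ssx * ssy),
    }


if __name__ == "__main__":
    print(summary_statistics([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

Working from raw sums suits a machine with accumulating registers, although with long series of large numbers the subtraction in `ssx` and `ssy` can lose significant figures.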

Linear interpolation, direct and indirect, and higher order interpolation like La- 
grange’s and Newton’s binomial interpolation formulas are included and illustrated. 
A table of Newton’s binomial interpolation coefficients is given and also a list of fre- 
quently used constants. 
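
As an illustration of the higher order interpolation mentioned above, the following sketch assumes that "Newton's binomial interpolation formula" denotes the forward-difference form, in which the leading differences of an equally spaced table are weighted by binomial coefficients in u = (x − x0)/h. The function names and the sample table are hypothetical and are not taken from the manual.

```python
def forward_differences(ys):
    """Leading forward differences y0, Δy0, Δ²y0, ... of a tabulated function."""
    diffs = [ys[0]]
    row = list(ys)
    while len(row) > 1:
        row = [b - a for a, b in zip(row, row[1:])]
        diffs.append(row[0])
    return diffs


def binomial_coefficient(u, k):
    """Generalised binomial coefficient u(u-1)...(u-k+1)/k! for real u."""
    c = 1.0
    for j in range(k):
        c *= (u - j) / (j + 1)
    return c


def newton_forward(x0, h, ys, x):
    """Interpolate at x from values ys tabulated at x0, x0 + h, x0 + 2h, ..."""
    u = (x - x0) / h
    return sum(binomial_coefficient(u, k) * d
               for k, d in enumerate(forward_differences(ys)))


if __name__ == "__main__":
    cubes = [0, 1, 8, 27, 64]                 # x**3 tabulated at x = 0, 1, 2, 3, 4
    print(newton_forward(0.0, 1.0, cubes, 2.5))
```

A table of the coefficients for standard values of u, such as the one the manual supplies, replaces the work done here by `binomial_coefficient`.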


Introduction to Operations Research. C. West Churchman, Russell I. Ackoff, and E. 
Leonard Arnoff. New York: John Wiley & Sons, Inc., 1957. Pp. x, 645. $12.00. 


Thornton Page, Operations Research Office


Operations research came into being about a decade ago because of the rapidly
growing complexity of technical decisions. During World War II it was applied
extensively to military problems, and since the war it has found increasing use in both
military and industrial applications. Over five years ago a summary of military operations
research was published¹ and there are now three journals in English devoted to
the subject² and two or three more published in other languages, but the burgeoning
new science has suffered from the lack of a well-organized and comprehensive text- 
book. 

¹ Operations Research by P. M. Morse and G. E. Kimball. New York: Technology Press of Massachusetts Institute
of Technology and Wiley, 1951.

² Operational Research Quarterly, published by the Operations Research Society, London; OPERATIONS
RESEARCH, published by the Operations Research Society of America; and Management Science, published by
The Institute of Management Sciences.

In Introduction to Operations Research, the industrial applications of operations
research are presented in logical and precise form; this well-written volume will be of
tremendous value in unifying the subject and stimulating its orderly extension; it
will improve teaching and serve research workers as a useful reference. Of course, an 
up-to-date companion volume is needed to cover military operations research, much 
of which is unfortunately shrouded in secrecy at present. The authors have wisely 
omitted discussing military applications, which differ in many respects from indus- 
trial applications, but because of this omission the title should more properly have 
been “Introduction to Industrial Operations Research.” 

The order of presentation follows the steps generally involved in operations re- 
search: formulating the problem, constructing a model, solving the model, testing, 
and control. This sequence is preceded by an introduction on the nature of operations 
research and followed by a section on administration of operations research activity. 
The discussion of models includes sections on inventory models, allocation models 
(linear programming), waiting-line models (queueing theory), replacement models, 
and competitive models (theory of games). Each of these sections is well illustrated 
by examples; for instance, optimizing stock levels, optimizing purchase lots (with price 
breaks and with various nonlinear restrictions), minimizing transportation costs, 
optimizing the utilization of processing facilities, optimizing the manning of factory 
tool cribs, minimizing traffic delays at toll booths, replacement of defective com- 
ponents, contract bidding, and production control in two manufacturing industries. 

The authors have been highly successful in using mathematics to illuminate with- 
out obscuring the reasoning. There are very few complex mathematical derivations; 
the examples are worked out in elementary detail, so that a reader familiar with first- 
year calculus can readily follow. 

Introduction to Operations Research clearly shows the place of statistics in operations 
research; although there is no one section devoted to probability and statistics as 
such, the concepts of statistics are used throughout, and of course particularly in the 
sections on queueing theory, replacement, and control. The statistician is not likely 
to learn much about statistics from this book, but he will learn a great deal about its 
applications to a wide variety of practical problems. 


System Engineering: An Introduction to the Design of Large-Scale Systems. Harry H. 
Goode and Robert E. Machol. New York: McGraw-Hill Book Company, Inc., 1957. Pp. xii, 
551. $10.00. 


James F. Digby, The Rand Corporation


Where can one find a book that gives a unified treatment to the wide range of
subjects that comprise modern system design? The present work makes a useful 
early entry in the current spate of books, most of them as yet unpublished, addressed 
to the generalists who design complex systems and lead mixed-team researches, and 
to their heirs-apparent. Even if the reader is not responsible for the design of large- 
scale systems he will find this book an invaluable aid to the full enjoyment of conven- 
tion cocktail parties and a considerable help in the generation of brilliant technical 
conversation at these affairs. The present work provides a nodding acquaintance with 
such old staples as Zipf’s Law, Nim, and knob design. It also touches on such dis- 
parate, but interesting, subjects as: the Ottawa Post Office’s automatic mail sorters, 
the Bell System’s No. 5 Crossbar System, photoformers, experiments in pattern 
recognition, therbligs, bistatic radars, the redundancy of the English language, Alex 
Bavelas’s experiments in group dynamics, foveal extent and blind spots, mercury 
delay lines, the Nyquist criterion, and kill curves for randomly assigned missiles. 
It is clear that a book of this nature cannot provide more than a sketchy treatment 
of each topic. This makes it necessary to inquire into the wisdom of the decisions to 
include and to omit material, into the balance of the whole book. It seems to this
reviewer that the authors were notably successful on this score. They have not only
achieved good balance, but have done a good job of tying the whole together. Interest 
is maintained, and the book is quite readable. 

Keeping in mind that the balance is, on the whole, quite good, it is possible to note 
four exceptions. First, there is only passing acknowledgement of a major design prob- 
lem: how to design a system in the face of real uncertainties (like contract changes or 
the invention of a much better device during system development). Not much at- 
tempt is made to differentiate between real uncertainty and statistical fluctuation. 
Secondly, the treatment of economics and costs is weak. Thirdly, there is no recog- 
nition of a valuable design tool that lies between analytic mathematical models and 
simulation with computers: this tool goes under names like operational gaming, 
command-post exercises, and map exercises. Fourthly, there is insufficient treatment 
of system reliability and the recent tools for its analysis. Nevertheless, on the whole, 
the authors have achieved a good selection of material, and deserve the more credit 
because their book is the first of its kind. 

Teachers of senior or graduate courses in system engineering should find this an 
excellent text. It could also be used to give a one-semester smattering of probability 
and statistics to students not majoring in mathematics; for this purpose it should be 
bolstered by a teacher willing to supplement the material with extra examples and 
derivations. There are 31 chapters in the book, and 10 of these deal primarily with 
probability, statistics, or related topics. These chapters are: Fundamental Notions 
[of Probability]; Distributions of Discrete Variables; Distributions of Continuous 
Variables; Characteristics and Distributions of Statistics; Stability and the Laws of 
Large Numbers; Design of Experiments—Data Gathering; Analysis of Experiments 
—Mathematical Statistics; An Example of Exterior System Design [a Dial System]; 
High Traffic—Queueing Theory; and Communications—Information Theory. 

Seven chapters deal with digital and analog computers in a reasonable and inter- 
esting way. Seven other chapters deal with such topics as game theory, linear pro- 
gramming, simulation, servomechanisms, human engineering, costing, system testing, 
and the management of system design activities. The remaining seven chapters deal 
with the authors’ own concept of large-scale system design. These chapters have evidently
benefited from their extensive experience and are enlivened by numerous well-chosen
examples and anecdotes.


Income and Wealth, Series V. Simon Kuznets, Editor. London: International Association
for Research in Income and Wealth, 1955. Pp. xiv, 242. 42 shillings. See review article on 
pp. 450-57. 

The Inter-Industry Flow of Goods and Services, Canada, 1949. Reference Paper No. 72. 
Dominion Bureau of Statistics. Ottawa, 1956. Pp. 52. $1.00. Paper. 


W. Duane Evans, Bureau of Labor Statistics 


The publication of this report places Canada, with many other countries, ahead of
the United States, where this approach was originated but where the latest available
comparable data refer to the year 1947.
The objectives, methodology, and results of the study are competently set out, in
a general format which will be familiar to readers who have examined earlier U. S.
publications. The first half of the paper gives a general description of the purposes, 
concepts, approach, potential uses, and conventions of the study, followed by a 
bibliography which is not exhaustive but appears very useful. The remainder of the 
paper is taken up by a technical appendix, which explains in more detail the sources 
of data and methods used in making estimates. Three tables are folded in the back
cover, a conventional “input-output” table in monetary terms, a table of input ratios,
and an output distribution table in ratio form. 

The principal practical difference between this and the U. S. 1947 tables is in the 
degree of industry and product detail used and hence available for analytical purposes.
The U. S. tables recognized about 450 industries, or “sectors,” with additional
product information. The Canadian tables distinguish 42 processing sectors, a level 
intermediate between the “two-” and “three-digit” levels of classification. The au- 
thors remark that this limits the value of the tables for market analysis. Inevitably 
someone will use the U. S. and Canadian tables together to analyze structural differences
between the economies of the two countries, and will be puzzled in many instances
to determine whether contrasting entries reflect simply heterogeneity within
broad aggregations or more fundamental differences. 

The principal conceptual difference is that the Canadian tables are expressed in 
“purchasers’ prices,” rather than in “producers’ prices,” as has been common in the 
U. S. A reason for this is that the Canadian tables essentially started with the national
accounts as given, the additional work being devoted to an extension of them.
In the U. S., the national accounts were included among the figures to be checked 
within the broad outlines of the input-output tables. A minor difference is that in the 
Canadian tables all imports are treated as “non-competitive,” while there was some 
additional flexibility in the U. S. tables. 

On balance, this study is well done and well presented. It will be more useful for 
descriptive than for analytical purposes, but it may be hoped that it will serve as a 
starting point for more detailed work. 


British and American Manufacturing Productivity: A Comparison and Interpretation. 
University of Illinois Bulletin Series, Bureau of Economic and Business Research, No. 81. 
Marvin Frankel. Urbana: University of Illinois, 1957. Pp. 130. $1.50. Paper. 


Irving H. Siegel, U. S. Council of Economic Advisers


This book presents and seeks to explain estimates of past, recent, and prospective
British-American labor productivity differentials for manufacturing. It is a creditable
contribution to a field that has been cultivated more intensively by students on
the other side of the Atlantic than in the United States. Its major statistical accom- 
plishment is extension to the postwar period (1947-48) of estimates of the kind as- 
sociated with the name of the late Laszlo Rostas, author of Comparative Productivity in 
British and American Industry (1948). As an interpretative study, it fits into a long 
tradition recently infused with new life by the reports of 66 British productivity 
teams that visited the United States under the Marshall Aid program and by Graham 
Hutton’s evaluation of these reports in We Too Can Prosper (1954). 

Frankel is well aware of limitations of the available data and of the methods he 
employs in deriving his postwar differentials for 34 manufacturing industries. He 
acknowledges that these intercountry productivity ratios “at best... represent 
rough approximations.” His appraisals of the significance of capital, establishment 
size, and market size as factors in explaining the comparative productivity levels are 
also subject to heavy discount. For example, he takes fuel input as a measure of capi- 
tal; and he uses other surrogates (e.g., output for establishment size and market size) 
that are not definitionally independent of productivity. He concedes that the spectre 
of spurious correlation haunts the whole postwar statistical analysis. 

In addition to estimating and analyzing postwar productivity differentials, Frankel 
makes some mechanical extrapolations forward to 2000-2025 and backward to 1830 
(at which date both countries presumably had equivalent manufacturing output per
worker). He upholds the thesis that output expansion associated with population and
labor force growth is itself a stimulant to productivity advance. He also discusses 
factors pertinent to the understanding of differentials observed in “cross-sectional” 
and “developmental” studies. Frankel’s (and others’) literary treatment of explana- 
tory factors would benefit greatly from organization around an explicit model of 
causation, interaction, and development that is conceptually complete, although per- 
haps quite general; that specifies a small number of broad but exhaustive categories of 
pertinent factors, allows for the participation of apparently “passive” factors or “con- 
ditions,” and provides for emergence of new factors through interplay of the original 
ones. 

Despite reservations already mentioned and others indicated below, and regardless 
of the degree of acceptance accorded to the author’s opinions and findings, the book 
merits the attention of serious productivity students. Like this reviewer, some readers 
may wish there were a more intensive introductory discussion of the conceptual prob- 
lems of international productivity comparison; measurements based on net output 
(say, value added in constant prices) as well as gross output; and appendix tables and 
notes on the raw data underlying the productivity computations for the two countries. 
This reviewer would also have liked to see a reference to the theoretical applicability 
of the Leontief-Evans input-output approach to “cross-sectional” studies of compara- 
tive productivity performance; the inclusion of unweighted geometric as well as un- 
weighted and weighted arithmetic means of industry productivity differentials; the 
inclusion of weighted as well as conventional unweighted Pearsonian correlation co- 
efficients; and the limitation of the term “subproducts” to a concept not discussed in 
the book (viz., to the characteristic output of only one step or sequence of steps in a 
complete manufacturing process). And at least Daniel Creamer may care that his last 
name is not spelled correctly even once! 


American Housing and Its Use. Louis Winnick. New York: John Wiley & Sons, Inc., 1957. 
Pp. xiv, 143. $5.50. 


Miles L. Colean, Washington, D.C.


In this small, fact-packed volume, Louis Winnick has turned out a feat of statistical
virtuosity, playing the 1950 Housing Census like a great organ, extracting from it 
subtleties of meaning and interpretation that will be stimulating—and perhaps even a 
little upsetting—to the most jaded student of these data. The skill with which the 
author uses his wide knowledge to bring material from other sources to elucidate an 
occasional lacuna in the census and the ingenuity with which he develops new inter- 
pretive devices (such as the multiple correlation of income, rent, and the utilization 
of space in Chapter 5) deserve both applause and careful study. 

Interesting and resourceful as are the author’s techniques, it is still the substance 
of the volume which, despite its late date, gives it its special freshness. Every chapter 
has some new clarification or some jostling of an old preconception. We learn, for 
example, at the very outset that although we are the best housed nation in the world 
(in terms of space per person) and although our housing represents about one-fourth
of our reproducible national wealth, we apparently do not as a people place the im- 
provement of our immediate living environment at the peak of our desires. 

Winnick finds only a slight improvement in the availability of space per person 
since the beginning of the century, despite the enormous rise in real income during the 
intervening period. Moreover, he finds that availability does not increase greatly with
rising income. A few passages to these points may be cited: “ . . . the simple correlation
between rent and income is not very high.” (Page 43) “ . . . the increase in housing
expenditures does not keep pace with the improvement in income.” (Page 44)
“The distribution of housing space in 1950 was remarkably even, far more so than the 
distribution of income and probably more equal than is the case of any other major 
economic asset.” (Page 8) 

These are facts that the building industry needs to ponder deeply in reviewing the 
state of its market. While the author notes that in the last decade some increase in 
the influence of income on the amount of space utilized may be discerned, the influence 
of the size of the household and the maturation of children remain predominant. In 
short, the industry seems still to have a long way to go in selling housing per se in 
competition with the other demands for the consumer’s dollar. 

The careful analysis that is given to some special aspects of the market produces 
significant conclusions. This is especially true of the subjects of the one-person house- 
hold and the living arrangements of the elderly. The proportion of one-person house- 
holds is shown to have nearly doubled since 1900, an increase unequalled in any but 
the two-person group. The main reason for this seems to be the growing unavaila- 
bility of space for lodgers (due to the decrease in size of the dwelling unit); and, while 
the increasing independence of young adults, especially women, is often assumed to 
be the source of new one-person households (and is indeed an important source), the 
typical such source remains single elderly women, generally renters, and usually poor. 

At the same time, Winnick finds no evidence that older people form independent 
households much more frequently than in the past, although increase in the absolute
number of these persons has given them a greater weight in the total market. Winnick 
suggests this area, with the number of anomalies that he discovers in it, to be par- 
ticularly needful of further research. 

The findings on overcrowding also are noteworthy. Surprisingly, the lowest income 
groups are not the most crowded in their housing accommodations. Rather it is the 
lower side of the middle income range that tends to be compressed; while, generally 
speaking, size of household rather than income is the critical influence. “Large house- 
holds with fairly high incomes,” Winnick concludes (Page 8) “are often more crowded 
than small households with modest means.” And again (Pages 39-40), “ .. . a typical 
household of, say 3 or 4 persons, having in 1950 probably twice the real income of its 
1934 counterpart, was found to be occupying less, rather than more, space.” Here is 
something else for the housebuilding industry to ponder. Has the efficiency of the 
typical house so increased that less space actually means more comfort? Or has the 
cost gone up disproportionately to income? Or, once more, have other consumer goods 
been more successful in bidding for the housing dollar? 

In contrast to the usual assumption, racial discrimination as a factor in itself 
apparently is not as significant as often supposed, since, income class for income
class, negroes appear to occupy as much space as whites. Discrimination shows up
in the poorer quality of housing generally available, despite income—another factor 
of significance to the future market. 

This reviewer has attempted to give only a few samples of the book’s riches, and 
he has focused these samples on their market aspects, not altogether because of his 
blindness to their social implications, but because he would incite industry to a greater 
use of such data in accommodating its potential customers. The book, however, has 
its value for all students of housing not only in its interpretation of the last Census 
but also in the aid that it may give in adding to the significance of the Census of 1960.


Year Book of Labour Statistics, Sixteenth Issue. Geneva: International Labour Office,
1956. Pp. xv, 503. $5.00 paper, $6.00 cloth. 


Kenneth O. Alexander, Michigan State University


The form and content of this sixteenth publication of international labor statistics
are similar to preceding issues. Text, tables, and notes are all given in English,
French, and Spanish. Material is broken down by subject, each subject constituting 
a chapter. In all chapters, the statistics of the various countries are set forth in a 
consistent order. 

Content of the eleven chapters is as follows: I, Total and economically active popu- 
lation (breakdowns by sex, age, occupational group, and industry). II, Employment 
(general indices as well as indices for industries and industry groupings, absolute 
figures by industry). III, Unemployment (general figures and breakdowns by industrial
or occupational groups). IV, Hours of work (average hours worked generally,
by industry, by industry groupings and by occupation). V, Wages and labor income 
(male and female average earnings generally, by industry groupings, by specific in- 
dustries and by occupation; statistics on total wages and salaries and labor income.) 
VI, Consumer price indices and retail prices (a total “cost of living” index and indices 
of some major commodity groupings, retail prices in native currencies for specific 
consumer goods for the various countries and their major cities). VII, Family living 
studies (statistics on sources of family income and family consumption expenditure, 
with more detailed information on patterns of food expenditure). VIII, Social Security 
(coverage, number of beneficiaries, annual receipts and expenditures). IX, Industrial
injuries (fatal injury rates for manufacturing and railways, both fatal and nonfatal 
rates for mining). X, Industrial disputes (“industrial disputes which resulted in a 
stoppage of work, and the number of workers involved and working days lost”). 
XI, Migration (emigration and immigration). Appendix I contains indices of in- 
dustrial production and wholesale prices and a table of exchange rates in terms of 
the U. S. dollar. Separate data for the Soviet Union, on population, the industrial
distribution of labor, retail prices, labor income, etc., are contained in Appendix II. 
These Russian data were taken from a 1956 publication of the Central Statistical 
Administration of the Council of Ministers of the U.S.S.R. Finally, there is a list of
publications of various countries which served as statistical sources. 

The thorniest of problems, for both the formulators of this volume and its users,
lies in the question of the comparability of the statistics. It is an old problem for 
the ILO and concern for it permeates this edition, as it has previous editions. The 
chapter introductions caution the reader concerning international differences in defini- 
tions, coverage, statistics-gathering methods, etc. Reference is made to a number of 
ILO and UN publications dealing in more detail with the problems and methodology 
of gathering international statistics on a number of subjects. All this is a lesser prob- 
lem for the user restricting himself to a single country, but even here differences over 
time in techniques, definition, etc., warrant extreme caution. Such a user should take
care to point out at length the differences from statistical practice in his native 
country, lest conclusions of substantive importance become hopelessly confused with 
differences of a procedural nature. 

Often the data can give only a broad, general picture, subject to qualifications and 
limitations. But these volumes represent an honest, painstakingly thorough approach 
to the difficult task of gathering international statistics. On many topics they are the 
sole source for most of us, the most convenient source for all of us. 

As a final practical note, the cloth bound volume seems a much wiser purchase, 
considering the narrow price differential. Only a small amount of thumbing was suf- 
ficient to separate the cover from the binding of this reviewer’s paper bound copy. 


Work, Workers, and Work Measurement. Adam Abruzzi. New York: Columbia University
Press, 1956. Pp. xvi, 318. $7.50.


GERHARD BRY, Rutgers University


My dictionary defines the term “provocative” as “tending to provoke, stimulate,
incense, stir up discussion.” The book under review well deserves this essentially 
complimentary epithet. Abruzzi inveighs against “those—and there are all too many 
—both in management and labor who have a direct interest in retaining time study 
more or less in its present form.” He believes that much of current theory and practice 
in the time study field is bargaining-oriented and inadequate for gaining insight into 
human work processes. He aspires to put work measurement on a truly scientific 
basis. He also attempts to lay the foundation for a theory of human work. These are 
formidable goals. Obviously, a group much larger than the small band of time-and- 
motion engineers will be interested in such an undertaking; readers of this Journal 
will be particularly interested in Abruzzi’s pioneering efforts to apply the principles 
of statistical inference to the analysis of work performance. 

The author begins his survey of current theories and practices in the field of work 
measurement with a general methodological discourse in which very rigorous standards
for scientific inquiry are set forth. Small wonder that time-and-motion study
approaches currently in use rarely measure up to these standards. This reviewer was 
impressed by Abruzzi’s account of the extent to which work measurement practices 
are deficient because of the use of inadequate instruments, primitive concepts, un- 
reliable measurements, and slanted evaluations. But it is not so much the detail of 
specific ideas or procedures that arouses the author’s wrath as it is the whole orienta- 
tion of time study toward use in wage setting and collective bargaining and the effects 
of the resultant bias upon all concepts and procedures currently in use. He feels that 
the mixing of the functions of “estimating” and “evaluating” plays havoc with the 
goal of scientific objectivity in work measurement. Many of the author’s critical 
forays against current work measurement procedures seem adequately justified. Some 
of his appraisals, however, would be more persuasive to this reader if they were stated 
in a less bellicose fashion. 

The middle part of the volume describes the author’s conception of a scientific 
approach to work measurement. He does not desire to set up “standards” or to 
measure the performance of different workers. He wishes to shed light on the process 
of work. Interpretation of evidence on this process may form the basis of managerial 
action. However, first the principles of proper work measurements must be developed. 
Abruzzi’s approach is fashioned largely after process analysis as applied to quality 
control. Control charts are used, along with control limits, sequential analysis, 
standards of stability, and so forth. The author’s use of statistical techniques is fairly 
orthodox. Essentially, the analytical problem consists in distinguishing between 
variations that can be attributed to assignable causes and those that are of random 
character. Stability of work performance is attained when the performances measured 
are within three standard deviations of their averages and when there are no continu- 
ous nonrandom runs. Significant instability requires interpretation and, possibly, 
remedial action. 
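
The stability criterion described above can be sketched in a few lines of Python. This is an illustrative reading of the rule, not Abruzzi's own procedure: the function names, the assumption that the center line and standard deviation come from a reference period, and the run length of 8 points on one side of the center are all choices made here for illustration.

```python
def longest_run_one_side(values, center):
    """Length of the longest run of consecutive points strictly on one side of center."""
    longest = current = 0
    last_side = 0
    for v in values:
        side = (v > center) - (v < center)  # +1 above, -1 below, 0 on the line
        if side != 0 and side == last_side:
            current += 1          # run continues on the same side
        elif side != 0:
            current = 1           # run starts on a new side
        else:
            current = 0           # a point on the center line breaks the run
        last_side = side
        longest = max(longest, current)
    return longest


def is_stable(values, center, sigma, max_run=8):
    """True if every point is within three sigma of center and no one-sided
    run reaches max_run points (max_run=8 is an assumed convention)."""
    within_limits = all(abs(v - center) <= 3 * sigma for v in values)
    no_long_run = longest_run_one_side(values, center) < max_run
    return within_limits and no_long_run
```

A series that fails either test would, in the reviewer's words, require interpretation and possibly remedial action; the sketch only flags the instability and says nothing about its cause.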

How successful is the application of the principles of process analysis to the 
“quality control” of human labor input? There is no doubt that the fundamentals of 
quality control have some application to the analysis of work performance. The 
author recognizes that there are important differences between the characteristics of 
work input and of machine output. For one thing, no counterpart exists, in work 
measurement, to the product specifications that set exogenous limits to the tolerable 
quality variation of machine output. Also, the empirical patterns may be materially
affected by the scope of human spontaneity and the wide variety of work situations. 
In this connection, it is regrettable that Abruzzi had to restrict his analysis to two 
firms in a specialized section of the apparel industry. The exposition is, in fact, a 
condensed version of his previous book, Work Measurement, published in 1952 by 
Columbia University Press. Its raw material comes from an intensive empirical 
investigation of work performance in two plants manufacturing ladies’ garments. It 
would have been instructive to compare the performance characteristics described 
by the author with others obtained in a different technological and managerial en- 
vironment. Such comparison might also throw some light on the relative impact of 
variations in individual efficiency as compared with technological changes or mana- 
gerial reorganization, under various operating conditions. In turn, this would help 
us to evaluate the industrial importance of Abruzzi’s approach to work measurement. 

The final part of the book is called “The Theory of Human Work.” In the opinion
of this reviewer, the author’s theoretical discussion in this area is still largely oriented 
toward the problems of work measurement, and lacks the broad generality suggested 
by the section title. In an intriguing final chapter, called “Postscript and Salutation,” 
Abruzzi discusses, among other topics, the changes in the character of work brought 
about by the spread of automatic production methods. He anticipates that the trivial,
repetitive—and measurable—part of labor input will be increasingly delegated to 
“mechanical workers,” that is, to machines. And increasingly, the human contribution 
will consist of nonrepetitive actions and creative judgments. These conditions will 
make the evaluation of work efficiency through the use of relationships between out- 
put and input time patently unrealistic. Eventually, they will also make obsolete the 
approach to work evaluation with which this volume is primarily concerned. 


Statistics of Labor Management Relations, Proceedings of a Conference Held at Asilomar, 
May 12-13, 1955. Berkeley: University of California, 1956. Pp. xii, 132. $0.50. Paper. 


THERESA R. SHAPIRO, Bureau of Applied Social Research, Columbia University


This book brings together 16 short papers delivered at a conference held at Asilomar,
Pacific Grove, California, under the joint sponsorship of the University of
California’s Institute of Industrial Relations and the Pacific Coast Social Science 
Research Council. 

The papers are grouped under the following headings: statistics of union member- 
ship, analysis of the provisions of union contracts, the statistics of health and welfare 
programs, statistical problems in measuring employer expenditure for wage supple- 
ments, and work stoppage and mediation statistics. 

These are all areas in which collecting the information and summarizing it presents 
serious difficulties, and most of the papers in this volume describe how these problems 
have been met and dealt with by either the California Department of Industrial 
Relations or the United States Bureau of Labor Statistics. The concern is with pro- 
cedure and classification, rather than statistical method or for that matter statistical 
analysis. In many ways, the problems are similar to those of the census taker—to 
cover the entire universe, and to classify the data in such a way as to make summarization
possible and to permit comparison from one period to another. But the
material, particularly the provisions of union contracts, is much more variegated 
than anything the census undertakes to collect. For example, the medical benefits of 
health and welfare programs are reported to fill two punch cards. 

In a sense, this little volume is a tribute to the patience, persistence, and pro- 
fessional devotion of its contributors, many of whom are pioneers in the statistical 
techniques they describe. Because, however, the techniques discussed here are those 
of procedure and classification, the book will have interest primarily for labor statisticians
struggling with the same problems and for the consumers of the data—
employers, unions, and students of industrial relations. 


Sample Survey of Labour Force in Rangoon: A Study in Methods. J. C. Koop. Rangoon, 
Union of Burma: Superintendent, Union Government Printing and Stationery, 1956.
Pp. 101. 1.50 Kyats=32 cents. 


PHILIP J. MCCARTHY, Cornell University


This monograph describes a sample survey conducted in 1953 in the municipal area
of Rangoon. The purpose of the survey was “to determine the size and the compo- 
sition of the labour force against the background of the total population according 
to age, sex, ‘race,’ employment status, industrial and occupational attachment.” 
Part I of the report gives a general description of the survey and presents tables that 
show the population and labor force cross-classified according to various character- 
istics. Part II contains a number of technical notes on methodology. This report 
makes an excellent addition to the growing body of literature which provides careful 
descriptions of well-planned and executed sample surveys. 

Included in the general descriptive material of Part I is one topic which should be
of special interest to individuals concerned with labor force measurement. The survey 
started with the labor force definitions in current use in Canada and the United States 
and then modified these to fit the peculiar circumstances in Rangoon. Thus an un- 
employed worker is defined as one who is “without work and willing to work” instead 
of one who is “without work and actively searching for work.” The reasons for this 
and other changes are carefully explained. 

The technical material in Part II relates to a stratified two-stage sample design 
where the primary sampling units are drawn with equal probability, some fixed num- 
ber of elements is chosen from each PSU in the sample, and an unbiased estimate of 
a population total is desired. Major emphasis is devoted to obtaining exact expressions 
for optimum design. Although this is a standard problem, if, from each PSU in a 
stratum, one will take the same number or the same fraction of elements and if one 
assumes a constant cost per element, the author attempts to develop optimum theory 
without these restrictions. This is accomplished by associating a possibly different 
sample size and cost per element with the order of the draw instead of with a PSU 
itself. By averaging over all possible samples of some stated number of PSU’s and 
over all permutations of a fixed set of sample PSU’s, the desired exact expressions for 
optimum design are obtained. This reviewer found the development very interesting, 
but is at somewhat of a loss to see how in a practical situation one would be able 
to arrive at differential costs per element which depend on the order of the draw and 
not upon the particular PSU that is drawn. The actual sample used in the survey 
took a constant number of elements from each sample PSU. 
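
The kind of estimator at issue can be illustrated with a short sketch. It assumes, as in the design described above, that PSUs are drawn with equal probability within each stratum and that elements are drawn by simple random sampling within each selected PSU. The function name and the data layout are hypothetical, and the code shows only the standard unbiased estimate of a total, not the author's optimum-design theory.

```python
def estimate_total(strata):
    """Unbiased estimate of a population total, stratified two-stage design.

    Each stratum is a dict with:
      "N_psus": number of PSUs in the stratum
      "sample": list of (M, ys) pairs, one per sampled PSU, where M is the
                number of elements in that PSU and ys are the sampled values
    """
    total = 0.0
    for stratum in strata:
        n = len(stratum["sample"])
        # Expand each PSU's sample mean to an estimated PSU total.
        psu_totals = [M * sum(ys) / len(ys) for M, ys in stratum["sample"]]
        # Expand the mean of estimated PSU totals to the whole stratum.
        total += stratum["N_psus"] * sum(psu_totals) / n
    return total
```

With a constant number of elements per sampled PSU, as in the survey itself, `len(ys)` is the same for every PSU; the sketch allows it to vary only to show where a differential sample size would enter.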

Also contained in Part II are some notes on optimum definition of the frame, punch 
card tabulation of linear estimates, an empirical law of variance, and non-sampling 
errors. 

Studies in the Quantity Theory of Money. Milton Friedman, Editor. Essays by Milton
Friedman, Phillip Cagan, John J. Klein, Eugene M. Lerner, Richard T. Selden. The 
University of Chicago Press, 1956. Pp. vi, 265. $5.00. 


JAMES W. ANGELL, Columbia University


HIS stimulating volume represents the results of a reaction both against some of 
the alleged excesses of the Keynesian revolution, and against some of the more 
“mechanical” interpretations of monetary phenomena of the pre-Keynesian era. It 
attempts to build a theory of the demand for money which can both be supported
empirically and which can be put on all fours with demand theory as a whole. The 
“demand” for money is taken at most points to be the demand for real balances; to 
be governed primarily, within given frameworks of individual tastes, incomes and 
the like, by some sort of opportunity cost, and to be measurable in terms of the 
reciprocal of some index of velocity. The book is hence largely concerned with the 
determinants of velocity, the supply of money and its changes being taken as data. 
It is only indirectly an investigation of “the quantity theory” in the older and more 
familiar sense. 

Cagan’s paper involves extensive manipulation of differential equations and an 
ingenious use of exponentially weighted averages, while Selden’s depends largely on 
partial correlations and multivariate regression equations, but the other two empirical 
studies make little use of statistical tools, perhaps chiefly because of inadequate data. 

Of the five essays, the middle three are concerned with episodes in the history of 
inflation. Lerner’s short study of the American Confederacy shows that, as in most 
severe inflations, prices at first rose less than but later much more than the money
stock, so that the real value of the stock fell by more than half: velocity more than 
doubled. The brief but sharp reversal produced by the forced contraction of the 
money stock in 1864 illustrates strikingly the effects of changes in expectations. 
Klein’s study of Germany in 1932-44 is concerned with a different phenomenon, 
repressed inflation. Here the real value of total cash balances doubled or more. The 
proximate explanation was the heavy decline in velocity, itself explained largely by 
the at least partial effectiveness of the Nazi controls over prices, production and fiscal 
policy (and in 1943-44 by transportation difficulties). This is all fairly familiar in the 
broad, but the detail and the procedures for correcting the crude data are useful 
contributions. 

Cagan’s study of seven hyperinflations (defined as more than 50 per cent increase in 
prices per month) is at once the longest and most complex of the essays, and tech- 
nically the most skilled. His principal findings are that in each case M and P both 
rose at generally increasing though far from uniform rates, with P usually far ahead; 
that the real value of total cash balances hence fell heavily, but fluctuated widely 
and never approached zero; that the changes in real balances were chiefly due to 
changes in the expected cost of holding them, as measured by expected price changes; 
that the latter expectations in turn were presumably based on past rates of price change, 
so that large changes in real balances followed those in prices only with a lag (other- 
wise the inflations would have been almost instantaneously explosive); and that the 
whole process of hyperinflation was due primarily to government action on the 
money stock—action itself explained by the fact that this was the easiest or perhaps 
the only major way to raise revenue. The hyperinflations were not due to “cost-price” 
spirals or to any great influence, unless at the beginning, from foreign exchange-rate 
depreciation, much opinion to the contrary notwithstanding. The lag apparently 
shortens, and the percentage allowance for future price rises becomes greater, as the 
past rate of price increase becomes larger. The most striking technical feature of the 
study is the development and testing of exponentially weighted averages of past 
price changes, for use in forecasting real balances held (the results are quite impres- 
sive, but one wonders if a much simpler if somewhat arbitrary system of weights, 
decreasing into the past, would not have been almost as effective!). The conclusions 
as to the responsibility of governments, and as to the dependence of the demand for 
money (in these cases at least) directly on its opportunity cost in terms of expected 
price changes, are of great importance. 
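
The contrast drawn above, between exponentially weighted averages of past price changes and a simpler set of weights decreasing into the past, can be made concrete with a small sketch. It is not a reproduction of Cagan's computations: the parameter name `beta`, the normalization over the available history, and the linear alternative are assumptions chosen here for illustration.

```python
import math


def exp_weighted_average(changes, beta):
    """Exponentially weighted average of past changes (most recent last).

    Weights fall by a factor exp(-beta) per period into the past and are
    normalized to sum to one over the available history.
    """
    n = len(changes)
    weights = [math.exp(-beta * (n - 1 - i)) for i in range(n)]
    return sum(w * c for w, c in zip(weights, changes)) / sum(weights)


def linear_weighted_average(changes):
    """Simpler alternative: weights 1, 2, ..., n, heaviest on the latest period."""
    n = len(changes)
    weights = range(1, n + 1)
    return sum(w * c for w, c in zip(weights, changes)) / sum(weights)
```

A larger `beta` concentrates the weight on recent periods, which corresponds to a shorter lag of expectations behind actual price changes; `beta = 0` reduces to a plain average.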

Selden’s excellent paper works more familiar ground, but with great skill and objectivity.
He is concerned with the demand for money in the United States over the
past century or more. “Money” is defined to include time deposits. This definition 
improves the smoothness of the velocity series, but at the cost of blurring over im- 
portant changes (or so I think; I will not argue the point here). Selden finds a fairly 
steady secular decline in velocity since 1839, but that this decline was large enough 
to be significant seems to me most debatable: (1) The official and other published 
data on State bank deposits in the nineteenth century, especially before the Civil War, 
are notoriously and even fantastically incomplete; a partial study I made some years 
ago indicated that only a fraction of the true totals had been reported; (2) from 1899 
to 1919, the apparent straight-line “trend” of velocity was virtually horizontal; 
(3) the same thing was true (admittedly at a lower level) from 1921 to 1930; (4) from 
1932 to date, the “trend” was definitely rising, with 1944-46 the only large interrup- 
tion (Selden’s figures stop in 1951, but the general rise has continued since then, back 
to roughly the 1899 levels). At least for the present century, this looks more like 
“long waves,” or perhaps “long steps,” than like a persistent secular decline. Short- 
term fluctuations in velocity, on the other hand, were positively correlated with the 
business cycle—as is familiar.

With respect to the explanation of these changes in velocity, Selden finds—a 
striking conclusion—that they were not significantly correlated with the cost of 
holding money as measured by bond yields. Rather, the secular changes were related 
to the secular increase in per capita real income (but see the comment above, on the 
reality of the alleged secular changes in velocity); the short-run changes, to changes 
in “tastes,” disproportions between different measures of velocity, and particularly 
the cost of money substitutes—defined as bond yields minus short yields (pp. 208-9, 
212; this definition can perhaps be challenged). These conclusions are obtained from 
a series of correlation analyses yielding reasonably high and significant coefficients. 
One could only wish that the computations, which stop in 1951, had been brought 
up to date. 

The first essay in the volume, Friedman’s, undertakes both to tie these other rather
dissimilar papers together, and to provide a general rationale for them—in effect, a 
restatement of the quantity theory as Friedman interprets the latter notion. It seems 
to me, however, that although the essay is subtle, penetrating in many details, and 
certainly provocative of thought, it falls short of achieving either of these goals con- 
vincingly. What Friedman is aiming at is a theory of the demand for money; changes 
in its supply are taken as essentially a datum. He argues that this demand is “stable,” 
meaning by that a stable functional relation (p. 17). But here the trouble begins. 
Stable in relation to what? Friedman’s own equations are extremely interesting, but 
are so general in form as to be not much more than a listing of categories. These 
categories include market interest rates, prices of assets, the rate of change of prices 
(all three of these being treated in terms of opportunity costs), real income, an income- 
wealth ratio, and a catch-all for other “utility”-determining variables. I do not find 
the injection of the permanent-income hypothesis, however, or the discussion of non- 
human wealth, particularly helpful here. Nor is the incorporation of the equations 
into a general-equilibrium system explored. 

The upshot is a complex, many-variable statement in a priori terms of the de- 
terminants of the demand for money. This is all right in itself, but the empirical 
studies seem to show certain inconsistencies. Thus Selden’s results focus primarily 
on the cost of money substitutes, as measured by the difference between long and 
short interest rates, and secondarily on per capita income and on “tastes,” which 
Friedman lumps together with divers other factors; but Selden throws out the cost 
of holding money as measured directly by interest rates, which Friedman stresses. 
Cagan throws out almost everything but the expected rate of price change—which 
is, of course, a measure of one type of cost of holding money. Klein is compelled to 
explain his striking discrepancies chiefly in terms of non-monetary controls; and 
Lerner’s results mainly support Cagan. In each case except Klein’s, some type of cost 
of holding money was the dominant factor, but the measures and modes of operation 
of this cost differed under differing circumstances. At least to my mind, Friedman
never really reconciles these apparent conflicts in the empirical findings. 

Moreover, there are many other factors besides those listed which might plausibly 
be thought of as affecting the peace-time demand for money in important degree and 
which one would have liked to see Friedman discuss, if only to reject. They include 
such things as the general state of business activity, inventory fluctuations, the volume
of capital outlays, and the general state of expectations about the future. And
one would have liked to see a companion study of the determinants of the supply 
of money, since a number must be the same as those affecting demand. Finally, there
is little attempt to investigate short-run relations in peace-time, and hence to com- 
bine monetary and business-cycle analysis. 

Taking the volume as a whole, therefore, I think that on the one hand it does not 
do much to answer the “practical” questions of policy-makers in peace-time, questions 
which of necessity are largely short-run; and on the other hand, I also think that its 
authors have not as yet provided a comprehensive, integrated, and usable statement
that can be called “the” theory of the demand for money, even in longer-run terms. 
There are still too many apparently unresolved internal difficulties and perhaps too 
many missing pieces, and the form of statement is as yet too general to be more than 
highly suggestive. 

But one should not demand the millennium from a single set of exploratory studies. 
This volume is one of the most important contributions to monetary theory to be
made in many years. It breaks much new ground, and it carries the general develop- 
ment of the theory a long and valuable step forward. 


My Father Irving Fisher. Irving N. Fisher. New York: Comet Press, 1956. Pp. xv, 352. 
$4.50. 


BENJAMIN J. KLEBANER, City College of New York 


The author of twenty-nine books, including Mathematical Investigations in the
Theory of Value and Prices, The Theory of Interest, and The Making of Index
Numbers, did not take the time during eighty years of an active life to write his pro- 
jected autobiography, unlike his contemporaries Richard T. Ely, John R. Commons, 
and Alvin Johnson. A major portion of this book, however, is in Fisher’s own words, 
as his son has drawn extensively from letters, lecture notes, diaries, in addition to 
excerpts from published writings. 

“I want to be a great man,” Fisher wrote to his wife in his thirty-eighth year. His 
position in the front rank of American economists is assured. Coming to the subject 
from a mathematical training, it was natural for him to pioneer in the field of mathematical
economics. His restless disposition, however, refused to be confined to academic
speculation. An 1895 statement of Fisher’s that he did not have many political
opinions was soon made obsolete. He made it a point to attend presidential nominating 
conventions and was usually able to secure an interview with the nominee which was 
an occasion for giving advice. Willkie, he concluded after one such interview, was 
superior to either of the Roosevelts or Wilson. 

An early bout with tuberculosis inspired a life-long interest in hygiene (How to
Live, written in collaboration with a physician, sold almost half a million copies in 
twenty-one editions), as well as the promotion of the Life Extension Institute and
the eugenics movement. Belief in prohibition and the virtues of acidophilus milk 
stemmed from the same source. Opposed to American imperialism and militarism 
since the 1890’s, he was active in seeking United States’ entry into the League of 
Nations. The most narrowly economic of his manifold activities was his crusade to 
stabilize the dollar. He even sought out Mussolini, hoping that the personal com- 
munication would serve to interest the dictator in stabilization through League 
action. 

These fields were not enough. As a freshman he applied for his first patent. Over 
the years he conceived, inter alia, a three-legged folding seat, a tent for tubercular 
convalescents, a world map which minimized distortion on a flat surface, and a visible 
card index. The index was the only commercial success. An impecunious scholar who 
had married into a wealthy textile family and was successful for a time in his own speculation,
he left a net estate so small that no federal tax had to be paid. Fisher’s confidence in 
the prosperity of the 1920’s, which he attributed to appropriate Federal Reserve 
action, is well known. It is matched by the mistaken enthusiasm revealed in an April, 
1933, letter: “I am sure... that we are going to snap out of this depression fast,” 
thanks to New Deal monetary measures. Plainly the greatness of the economist- 
statistician does not rest on the accuracy of his prognostications. 

Fisher was connected with Yale as student and professor for over half a century. 
This reader would have welcomed much more discussion than the author vouchsafes 
about what was happening at the university during these eventful years, as viewed 
by one of the outstanding faculty members. In a 1925 letter he refers to his profession 
as one “which scarcely pays a living wage.” We learn little, however, about Fisher 
as a professor. A tantalizing reference to a blacklisting of alleged subversives (of 
whom Fisher had the honor to be one) by the D.A.R. does not give sufficient details 
to satisfy the curious. Nowhere are we told of Fisher’s reaction to Keynes’ General 
Theory or, for that matter, his estimation of Marshall’s Principles. The very extensive 
quotations from Schumpeter’s readily accessible memorial article (Econometrica, 
1948) and such anecdotes as those relating to the tribulations of a ride in an early 
auto (pp. 173-74) and the difficulties of having an English manuscript typed in 
Paris (pp. 226-28) might well have been replaced by more revealing personal insights 
into the man and his world. 


Foreign Trade and Industrial Development of China—An Historical and Integrated
Analysis through 1948. Yu-Kwei Cheng. Published under the auspices of The American
University and The China International Foundation by The University Press of Washington,
D. C., 1956. Pp. xi, 278. $7.00.


JEROME B. COHEN, City College of New York


The purpose of this careful and thoroughly documented scholarly study is to show
the historical development and interrelationship, in integrated fashion, of the
most dynamic sectors—trade and industry—of China’s static and agricultural econ- 
omy. The time range is from the Opium War of the 1840’s through 1948, the eve of 
the inception of the communist regime. 

Emphasis for the prewar decades is upon the importance of institutional arrange- 
ments, the trend analysis of trade composition in the wake of industrial development, 
the interrelationship between an import surplus and foreign investments in China, 
and the general effects of tariff and silver price fluctuations on China’s industry and 
total economy. In the war and post-war period, 1937-1948, the development and degree
of Chinese inflation are discussed stage by stage and their effects on industry and
trade have been assayed. The economic conditions of occupied China and the rapid 
industrialization of Manchuria under Japanese rule, 1932-1945, are investigated, 
analyzed and evaluated. The highlights of the historical studies are summarized in 
the next-to-the-last chapter. The final chapter deals with theoretic considerations of 
problems confronting the industrialization of underdeveloped countries in general 
and with the long-range prospects of Chinese industrialization in particular. 

In presenting the historical facts and analysis in a period of modern Chinese 
history laden with foreign domination, wartime strains, post-war economic crises and 
civil strife, the author, senior economist of the Institute of Social Sciences, Academia 
Sinica, and Deputy Director, Institute of Economic Research, National Resources 
Commission, Nanking, China, has made every effort to be dispassionate, objective, 
and nonpartisan. While the study is the direct result of four years of intensive research
done at the Brookings Institution, the author has drawn from his former
studies done at the Academia Sinica and the reference materials gathered by the 
National Resources Commission of the Republic of China, organizations with which 
he was associated for nearly two decades. 

From hog bristles to hyper-inflation, from Co-Hong to Chiang, from Kowloon to 
Keynes, this volume blends a vast intermixture of Asian and Western sources and 
statistics. Appendix tables in Haikwan taels and U. S. dollars emphasize the bridging
of the gap for which the author is so eminently qualified.

It is a sad history which Yu-Kwei Cheng relates. After noting that “China was 
on the silver standard up to November 1935. Yet she produced virtually no silver 
and exercised no influence in its price fluctuations,” the author declares: 

“In short, in modern Chinese economic history up to 1935, the fluctuation of the 
world price of silver called the tune of Chinese economy which was greatly aggravated 
or boomed through the movement of silver, decrease or increase of foreign investment, 
fall or rise in commodity trade, remittances from overseas Chinese, and various re- 
lated real-estate and financial speculations. Foreign entrepreneurs and bankers in 
China, because of their dominant commercial, industrial and financial interests, 
were responsible for most of these major economic actions taken under the protection 
of treaty-privileges and extra-territorial rights, without the knowledge and free from 
the control, of Chinese authorities ...” (p. 217). 

“China was freed from treaty bondage, first by circumstances and then by abrogation,
and from foreign economic domination. But she was now confronted by new
and equally formidable economic problems in her industrial and commercial develop- 
ment—blockade and inflation.” 

Harassed and tortured during World War II, “on V-J Day China emerged as an 
independent and victorious nation. But the three years of post-war hyper-inflation, 
after a brief initial break, brought her to total economic ruin and collapse” (p. 223). 

If there is any criticism at all of this volume, it is that Western specialists reading 
it will be somewhat disappointed by its relatively heavy leaning on well-known 
secondary Western sources, such as Remer’s Foreign Investments in China, Jones’ 
Manchuria Since 1931, etc., and too few, hoped for, original Chinese sources, which 
an avid reader might have expected from so learned a Chinese scholar. 


Industrial Location in the New York Area. John I. Griffin. The City College New York
Area Research Council, Monograph 1. New York: The City College Press, 1956. Pp. xiii,
212. $5.00.


John A. Howard, University of Chicago


The industrial growth of certain geographic areas in the postwar period has some-
times created fear in less rapidly growing communities that migration of industry
may be contributing to this growth. Professor Griffin devotes himself to the problem 
of the emigration of manufacturing industry from a 15-county New York area which 
includes New York City. Although his objectives are stated more broadly (p. 1), he 
addresses himself essentially to three questions, the magnitude of New York City’s 
problem, its causes, and recommended action. 

A comparison of the rates of growth of manufacturing employment between 1900 
and 1954 in the 15-county area indicates that New York City has experienced less 
growth in manufacturing than any of the surrounding counties except one. Other 
more specific evidence is adduced to suggest that New York City does have a problem 
of industrial emigration, e.g., during the period 1946 to 1954, 100 firms, each employ-
ing 100 or more employees, moved from New York City.

A mail questionnaire to manufacturing establishments in the 15-county area is the 
main source of evidence of the causes of emigration. From the survey an index of 
the relative importance of each locational factor in a county and of the relative 
locational desirability of each county is computed. Although no conclusions are drawn 
as to what these attitudinal data suggest about the relative locational desirability 
of New York City as a whole, the author does point out that among the five counties 
making up New York City there is almost as much variation as among all fifteen 
counties. 

The author recommends that something be done about the problem: 


“Public action in the case of New York City should mean a clear-cut acknowledg- 
ment that manufacturing has been the economic base of the city and that every 
effort must be expended to preserve it. This means not only industrial promotion 
through a public agency but specific action to preserve existing industrial space and 
to create new space safe from the encroachments of housing developments.” (p. 4) 


The book is vulnerable to serious criticism. The survey of attitudes toward loca- 
tional factors will be evaluated first. The major limitation of the survey is that a 
mail questionnaire is not a suitable instrument for obtaining this kind of information. 
The ambiguity of such data even when gathered by extensive personal interviewing 
is illustrated repeatedly in another study of industrial mobility.1 When, for example, 
a manager replies that “wages” is an “important” locational factor in his area he can
mean either (1) that a certain minimum level of wages is necessary for him to operate or
(2) that this minimum is so low that it is found in only a few locations. Only
in the second sense is it economically relevant. Terminology is ambiguous in another 
way. “Municipal taxes” can be an unfavorable factor either because of the level of 
taxes or the way in which they are levied. 

Turning to the questionnaire used in the study, it is not clear whether the author 
is concerned with obtaining a respondent’s views as to the factors that specifically 
would affect his decision to locate or factors that affect location generally. The cover- 
ing letter implies the former. We are not told who in an establishment completed the 
questionnaire, so that the reader cannot judge the reliability of the information. It is 
doubtful whether the plant managers of multi-plant companies would have thought
about the factors affecting their location. Even if they had, they probably would know
little about the relative importance of the factors because much of the relevant data
is probably maintained at the home office where the decision would be made.

1 George Katona and James N. Morgan, “The quantitative study of factors determining business decisions,”
Quarterly Journal of Economics, Vol. 66 (1952), 67-82.

The universe was taken to be the establishments currently located in the study 
area. The omission of establishments that had recently moved from the area would 
suggest that an important source of information, at least about unfavorable attitudes, 
was lost. Also, establishments with less than 100 employees were omitted and these 
constituted approximately 95% of the total number in the area, though only 50% 
of the total employment. 

The limited response of 25% may have been an important source of bias, and 
although personal interviews were conducted, it is not made clear whether their pur- 
pose was to evaluate the extent of this bias. Griffin explains that time did not permit 
a follow-up letter. 

A final criticism of the survey has to do with the index of the relative importance
of each of the locational factors. Cardinal values were assigned to a respondent’s 
ranking of the factors. The values, for example, were +50 for the most important
favorable factor, +40 for the next most important favorable factor, -50 for the
most important unfavorable factor, -40 for the next most important unfavorable
factor, etc. Only five favorable and five unfavorable factors were included in the
index. The sum of these individual respondent scores for a given factor was divided
by just the number of respondents who cited this factor rather than by all of the
respondents from that county. These scores were then ranked to yield the relative 
importance of each of the factors in a given county. The system gives an undue weight 
to a few respondents who felt strongly about a given factor. More fundamentally, a 
respondent’s ranking of factors reveals little about the absolute intensity of his feeling 
and therefore about the influence upon his decision to re-locate. 
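
The scoring scheme criticized above can be sketched in a few lines. This is only an illustration under stated assumptions: the review gives the values 50 and 40 for the two highest ranks, so the remaining values (30, 20, 10), the function name `factor_indexes`, and the four sample responses are hypothetical and are not taken from Griffin's monograph. The sketch computes each factor's index both ways, dividing by the respondents who cited the factor and dividing by all respondents in the county.

```python
from collections import defaultdict

# Assumed scale: the review states only +50 and +40 (and -50, -40);
# the values 30, 20, 10 for ranks three to five are an extrapolation.
RANK_SCORES = [50, 40, 30, 20, 10]


def factor_indexes(responses):
    """Return (index by citing respondents, index by all respondents).

    Each response is a dict with ranked lists under the keys
    'favorable' and 'unfavorable' (most important factor first).
    """
    totals = defaultdict(int)   # signed score summed over respondents
    citers = defaultdict(int)   # number of respondents citing the factor
    for resp in responses:
        for sign, key in ((1, "favorable"), (-1, "unfavorable")):
            # Only the first five factors on each side enter the index.
            for rank, factor in enumerate(resp.get(key, [])[:5]):
                totals[factor] += sign * RANK_SCORES[rank]
                citers[factor] += 1
    n_all = len(responses)
    by_citers = {f: totals[f] / citers[f] for f in totals}
    by_all = {f: totals[f] / n_all for f in totals}
    return by_citers, by_all


# Hypothetical county with four respondents.
responses = [
    {"favorable": ["labor supply"], "unfavorable": ["municipal taxes"]},
    {"favorable": ["labor supply"], "unfavorable": []},
    {"favorable": ["labor supply"], "unfavorable": []},
    {"favorable": ["labor supply", "markets"], "unfavorable": []},
]
by_citers, by_all = factor_indexes(responses)
print(by_citers["municipal taxes"], by_all["municipal taxes"])
```

In this made-up county a single respondent who ranks “municipal taxes” first among unfavorable factors produces an index of -50 when the divisor is the number of citing respondents, but only -12.5 when the divisor is all four respondents, which is the undue weight the review describes.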

Aside from the survey of attitudes toward location, one of the most serious limita- 
tions is the failure to formulate hypotheses explicitly. Griffin shows, for example, the 
distribution of both the universe and the respondents by Major Industry Group 
(Standard Industrial Classification) which would suggest that he felt the type of 
industry might be important. He does not say this, however, nor does he investigate 
industry differences among the factors listed in the survey. There are many illustra- 
tions of the failure to use data fully. A related defect is the failure to integrate the 
findings of a large number of studies into the results of his survey and into the body 
of the report generally. 

Minor criticisms can be made, also. The author uses the terms “firm” and “estab- 
lishment” interchangeably. Apparently he is consistent in meaning “establishment.” 
The addition of a summary at the end of each chapter would have contributed to the 
reader’s comprehension. Approximately one-third of the publication consists of full 
pages of data in tabular form. Much of the data could be relegated to an appendix, 
particularly since the expected readers are presumably laymen. Charts are used in 
some instances to summarize the tabular data. 

Finally, the typography could be improved. It is disconcerting to find on p. 16 
references to Charts IV and V with no mention of chart titles and to discover these 
charts seven and nine pages later with titles but unnumbered. Again, a table may 
run for several pages but the table number is not repeated on succeeding pages. 
County names are used consistently in discussing New York City, but the reader who 
is not familiar with the area and refers to a map on either of the inside covers finds 
two of the counties identified only by their municipal names—New York County and 
King’s County are labelled, respectively, Manhattan and Brooklyn. 

In summary, the survey information is of doubtful reliability, and a considerable
amount of other information is unused. The relation between evidence and conclusions 
is often not made explicit. More extensive public policy recommendations should 
have been possible. 

Griffin does succeed in placing the magnitude of New York City’s industrial emi- 
gration problem in better perspective. He must have exercised painstaking care in 
bringing together the great number of studies bearing on location in the New York 
area. An appended bibliographical essay and bibliography suggest that he is familiar 
with the more general location literature as well. His county-by-county descriptions 
may be useful to others studying industrial location in this area, especially to manage- 
ments contemplating location. His list of establishments with 100 or more employees 
may be helpful to researchers, though it will rapidly become outdated.


Modern Experiments in Telepathy. Second Edition. S. G. Soal and Frederick Bateman. 
New Haven, Conn.: Yale University Press, 1954. Pp. xv, 425. $5.00. 


Ray Hyman, Harvard University 


Two minds can communicate with each other, according to Modern Experiments in
Telepathy, without the intervention of normal sensory channels. Such communica-
tion seems to be just as effective when 500 miles, rather than the usual dozen feet, 
separate agent and percipient. And even more intriguing is the claim that the per- 
cipient can be aware of a thought before the agent has concentrated upon it. 

Confronted with such claims, the sceptical scientist is apt to ask himself, “Is extra- 
sensory perception fact or artifact? And if it is a fact what consequences, if any, does 
it foreshadow for me and my work?” 

Clearly, the scientist views the possibility of extra-sensory perception with mis- 
givings. The concepts of telepathy and precognition, especially with their historical 
origins in spiritualism and psychic investigation, suggest a return to the animism and 
magic from which modern science supposedly has freed itself. When a carefully 
documented work such as Modern Experiments in Telepathy, therefore, attempts to 
place extra-sensory perception on a scientific footing, he will scrutinize the argument 
for possible loopholes. 

Soal and Bateman base their conclusions upon more than 60,000 guesses obtained
from two gifted subjects. These guesses were made in the following experimental 
situation. An experimenter, on each trial, relays a number to an agent. The agent, in 
turn, encodes this number into one of five symbols. Only the agent knows which 
symbol corresponds to which number, and he changes this code after every block of 25 
trials. The agent concentrates upon the appropriate symbol while the percipient, in 
an adjoining room, tries to guess which one it is. A typical experimental session con- 
sists of 400 such guesses. All told, the results of the Shackleton-Stewart series com- 
prise 170 sessions. 

The telepathy displayed by the two subjects was not the dramatic “mind reading”
of folklore and conjuring. The communication between agent and subject was of a 
very rudimentary kind, being detectable only in the form of a slight, but statistically 
significant, excess of correct guesses. Basil Shackleton consistently guessed better
than expectation, over a period of 2½ years, with odds against chance quoted at 10^35 to 1.
This was true, however, not when his guess was matched against the target symbol, 
but only when it was matched against the next symbol in the series. Mrs. Stewart, 
over a period of 4 years, guessed the target symbol sufficiently often to make the
odds against chance 10^70 to 1.

Soal and Bateman, anticipating negative reactions to their work, have fortified
these odds with sophisticated arguments and evidence to the effect that their positive
results are not due to recording errors, optional stopping, improper selection of data,
ad hoc tests of hypotheses, wrong statistical model, inadequate randomization of
target symbols, or deliberate signalling between agent and percipient.

In addition to these painstaking efforts to avoid statistical and methodological 
pitfalls, the book raises many other issues to frustrate hasty attempts to “explain 
away” the findings as artifact. The features that make the Soal and Bateman argu- 
ment one of the most formidable in the annals of psychic research, in this reviewer’s 
opinion, can be summarized under four points. 

1. Consistency of scoring rate. Despite changes in personnel, procedures, symbols, 
locale, etc., the scoring rates remained remarkably stable from session to session for
both Basil Shackleton and Mrs. Stewart. 

2. Empirical checks on the adequacy of randomization procedures. Soal and Bateman 
systematically applied the following cross check to each block of 50 target stimuli and 
guesses. The first 25 guesses were matched against the last 25 target symbols, and 
the last 25 guesses were matched against the first 25 target symbols. This cross check 
yielded results in accord with the probability model. A cross check on 33,500 guesses 
made by Mrs. Stewart, for example, registered 6,711 correct hits as against a binomial
expectation of 6,700. When the same guesses were matched against their intended 
targets, the actual number of correct guesses was 8,594 hits or 27 standard deviations 
above chance. 

3. The difference between telepathic and interspersed clairvoyant trials. During some 
sittings the authors alternated blocks of telepathic trials (in which the agent looks at 
the designated symbol) with clairvoyant trials (in which the agent does not look at
the designated symbol). Both Shackleton and Mrs. Stewart maintained their signifi- 
cantly positive scoring rates on the telepathic trials while guessing no better than 
chance on the interspersed clairvoyant trials. 

4. The long distance experiment with Mrs. Stewart. A series of 6 sessions, in which 
Mrs. Stewart was separated from the agent by 500 miles, was very successful with 
results comparable to those obtained at regulation distance. 
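
The cross check in point 2 and the arithmetic behind it can be sketched as follows. This is a minimal illustration, not a reconstruction of Soal and Bateman's own computation: the function names are hypothetical, and the chance probability of 1/5 is inferred from the five symbols and from the stated expectation of 6,700 hits in 33,500 guesses. Under a plain binomial model with the counts as printed, the cross-check result lies well within one standard deviation of expectation, while the direct matching lies roughly 26 standard deviations above it, somewhat below the 27 quoted, so the published figure presumably rests on a slightly different trial count or variance estimate.

```python
import math


def cross_match_hits(guesses, targets, block=50):
    """Count hits when each half-block of guesses is matched against
    the other half-block of targets (first 25 guesses against last 25
    targets, last 25 guesses against first 25 targets)."""
    half = block // 2
    hits = 0
    for start in range(0, len(guesses) - block + 1, block):
        g = guesses[start:start + block]
        t = targets[start:start + block]
        hits += sum(a == b for a, b in zip(g[:half], t[half:]))
        hits += sum(a == b for a, b in zip(g[half:], t[:half]))
    return hits


def binomial_z(hits, n, p=0.2):
    """Deviation of the hit count from binomial expectation, in
    standard deviations; p = 1/5 is assumed from the five symbols."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return (hits - mean) / sd


n = 33_500                          # guesses in Mrs. Stewart's cross check
z_cross = binomial_z(6_711, n)      # guesses matched against wrong targets
z_direct = binomial_z(8_594, n)     # guesses matched against intended targets
print(round(z_cross, 2), round(z_direct, 1))
```

The value of the cross check is that it supplies an empirical baseline drawn from the same guesses and the same target sequences, so a bias in the randomization would be expected to show up in `z_cross` as well as in `z_direct`.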

How do such results stand up to the criticisms of sceptics? This reviewer was able 
to classify all the major attempts to dismiss the findings of Soal and Bateman under 
one of three headings: possibility of sensory cues, defects in the probability model, or 
an incompatibility of extra-sensory perception with the presuppositions of modern 
science. As we shall see, none of these criticisms, with the exception of the imputation 
of fraud to the investigators and subjects, succeeds in finding a plausible alternative 
that encompasses all four points. 

Possibility of sensory cues. One obvious place to seek a flaw is in the possibility of 
actual sensory communication between agent and percipient. The “double-whisper-
ing” theory of D. H. Rawcliffe, which Soal and Bateman brand as “preposterous,”
invokes involuntary whispering and hypersensitive hearing as the links between the 
agent and the experimenter, on one hand, and the agent and the percipient on the 
other hand. Although Rawcliffe’s suggestion is needlessly complex, it is obviously
less “preposterous” to most scientists than is the alternative of extra-sensory per- 
ception. 

When we consider the conditions under which most of the sittings were conducted 
(in the percipient’s own home, inadequate separation of experimenter from agent and 
percipient, subjectively timed trials, vocal announcements of trial numbers, etc.), a 
simple explanation in terms of subliminal sensory cues does seem quite reasonable. 
Communication between agent and experimenter, who were seated at the same table 
with only half a screen separating them, in the form of sensory cues transmitting 
partial or complete information about the code is well within the bounds of possibility.
And communication between the experimenter, as he vocally announces each trial 
number, and the percipient is even more plausible. 

But this hypothesis runs into serious difficulty when it encounters the successful 
series in which Mrs. Stewart was separated from the agent by the English Channel. 
Unless a more detailed description of these 6 sittings uncovers unsuspected loopholes, 
Soal and Bateman can righteously ignore the accusation of actual sensory contact
between agent and percipient. This successful long distance series is indeed fortunate
for the argument of Soal and Bateman. It is only because of the outcome of this
series, in this reviewer’s opinion, that the entire set of 170 sittings becomes plausible
evidence for non-sensory communication. 

Defects in the probability model. Almost as curious as the phenomenon of extra- 
sensory perception, itself, is the proposition that the deviations from chance imply a 
bias that results when the probability model is used in conjunction with published 
tables of random numbers (Brown in Nature, 25 July 1953; Boring in American 
Scientist, January 1955; Bridgman in Science, 6 January 1956). In essence, the argu- 
ment implies that Soal and Bateman make all their comparisons against an a priori 
baseline and that this baseline may be too low because of biases in randomization 
procedures. 

Just how such a suggestion could arise in connection with the work in Modern Ex- 
periments in Telepathy is puzzling, indeed. For one of the outstanding virtues of this 
work is the empirical cross check on the randomization procedures. Both the cross 
check and the results of the clairvoyance trials yielded empirical numbers of hits that 
agreed well with the probability model. And the number of correct guesses by Shackle- 
ton and Mrs. Stewart were far in excess of the empirical as well as the a priori base- 
lines. 

It certainly must be a peculiar bias, as Soal and Bateman point out, that operates 
only on telepathic calls and suspends activity on interspersed clairvoyant calls and 
on empirical cross checks. 

Incompatibility of extra-sensory perception with the basic presuppositions of modern 
science. Some scientists view extra-sensory perception as incompatible with science. 
And the parapsychologists, themselves, have furthered this impression by holding 
up their findings as arguments against the “materialism,” “mechanism,” and “phys- 
icalism” of modern science. Further, this incompatibility has been seized upon as 
sufficient grounds for dismissing the evidence for extra-sensory perception (Price, 
Science, 26 August 1955). 

Such an approach shifts the emphasis away from the empirical evidence and brings
us face to face with complex psychological and philosophical questions. One question
is how fundamental is the causality concept to science. And just where one stands 
on this issue—the role of causality in science; whether it be invariant temporal se-
quence, functional relation, statistical regularity; whether it need imply spatio-
temporal contiguity; etc.—depends upon which philosopher of science one prefers to
read.

There are other complex ramifications of such a position. Are these basic principles
of science of such a universality that we must assume them to hold even in areas not 
previously explored by science? And, to get back to the empirical plane, how do we
know that extra-sensory perception violates such principles? The answer to the latter 
question depends partly upon definitions and partly upon the accumulation of more 
data than is currently available. 

It seems to this reviewer that the attempt to deny extra-sensory perception the 
status of “fact” on purely a priori grounds exaggerates the universality and status
of the basic presuppositions concerning space, time, and causality. And, as a corollary,
it enhances the significance of parapsychological findings beyond anything warranted
by the paucity and crudity of the data. 

Rather than argue the question at a philosophical level—a level at which there is 
little agreement concerning definitions and ideals—Soal and Bateman’s data, like 
other scientific evidence, should be evaluated on their empirical merits. And at this 
level, it seems, most critics are willing to concede that the findings were obtained 
according to the same rules of procedure we use in orthodox inquiries. 

The real question, then, is not whether the phenomenon (or phenomena) uncovered 
by Soal and Bateman is a “fact”—for by current empirical standards it is a fact— 
but what kind of a fact is it? What significance does this “fact” have for the rest of 
science? 

The phenomenon called extra-sensory perception does seem to be a peculiar fact. 
Among other things the evidence suggests that it is elusive and not repeatable by 
prescription; it is rare; it is beyond conscious control; it is unpredictable; it is detect-
able only by subtle and indirect means; and it is shy in the presence of scepticism.
Just how characteristic such features are is a moot question. Soal and Bateman’s 
data more than adequately demonstrate the existence of the phenomenon; but too 
many things were varied simultaneously from session to session to enable one to 
isolate factors that correlate with scoring rate. 

For all practical purposes, then, extra-sensory perception behaves very much like 
what the scientist calls errors of measurement. In his quest for repeatable, systematic 
relationships the scientist lumps all factors that “cause” “random” fluctuations in
his measurements from experiment to experiment under the broad category of vari- 
able errors. Extra-sensory perception, even if it were to enter into the scientist's 
measurements, would do so as a random error. From what we know about the pheno- 
menon, it would certainly not produce any long-range bias. 

What about the theoretical status of this phenomenon? As far as this reviewer can 
see (and, judging from published reactions to this book, there are many who would 
disagree) the walls of science will not come tumbling down if we admit the phenom- 
enon to be a fact. For, at the moment, it is nothing but an isolated fact. It fits into 
no scientific scheme; it can be deduced from no current theoretical system. With 
more evidence concerning the antecedent conditions and correlated factors, it may 
persist in remaining an isolated fact—one with little significance beyond itself. Or 
it may force a duality upon us concerning human behavior and physics; or it may 
force us to revise our basic premises concerning science; or it may—after all the heated 
controversy—turn out to be something that is compatible and consistent with our 
present presuppositions about the physical world. 

It is too soon to judge. We need many more facts than the parapsychologists have
supplied us. And judging from the difficulties that lie ahead for them, it will be a long 
time before we will be in a position to judge. 

In some ways the present situation in parapsychology reminds one of the “ex-
planatory crisis” that Bridgman saw posed by relativity and quantum phenomena
(Logic of Modern Physics, 1928). “Whenever experience takes us into new and un- 
familiar realms, we are to be at least prepared for a new crisis. . . . It seems to me 
that the only sensible course is to . . . wait until we have amassed so much experience 
of the new kind that it is perfectly familiar to us, and then to resume the process of 
explanation with elements from our new experience included in our list of axioms.” 

Is the phenomenon of extra-sensory perception a fact that we should be concerned 
about? The answer to this question must wait upon this amassing of “so much experi-
ence of the new kind.” For, as Boring has put it, “Of its importance in the developing
scientific skein, posterity will be able to judge, and you cannot hurry history.” 


Social Characteristics of Urban and Rural Communities, 1950. Otis Dudley Duncan and
Albert J. Reiss, Jr. New York: John Wiley & Sons, Inc., 1956. Pp. xviii, 421. $6.50.


Robert E. Weintraub, City College of New York


Using Census data, Duncan and Reiss examine differences in population char-
acteristics in United States communities in 1950. The basic assumption of the 
study “... is that population characteristics reflect the selective influence of different 
types of communities.” (p. xii) This assumption will arouse little, if any, hostility,
since it requires only that people consider “the whole of advantages and disadvan-
tages” in selecting places of employment and residence and that units of productive
power, including land and labor, are not fully interchangeable for purposes of pro- 
duction. 

The thesis that different types of communities attract different types of people 
serves as the point of departure. The hypothesis of the monograph is that the popula- 
tion types (age groups, income groups, etc.) that are attracted by different communi- 
ties (political entities) can be predicted in terms of four identifiable community 
features: (1) size; (2) spatial position with respect to central cities; (3) rate of popula- 
tion growth; and (4) functional specializations. 

The study is divided into four parts. Part I deals with the association between 
community size and population characteristics; Part II is concerned with the correla- 
tion between spatial position and population characteristics; Part III examines the 
association between population growth and economic opportunity; and Part IV 
treats the relationship between the dominant types of productive activity and popula- 
tion characteristics. The details of these analyses are numerous but the main results 
are summarized in Chapter II of Part I and at the end of each chapter in Parts II, III, 
and IV. 

The results of the comparative study of communities of different sizes are mixed. 
Some indexes of population characteristics are monotonic functions of size-of-place. 
For example, median income and median years of schooling completed are monotonic 
increasing functions and the index of fertility is monotonic decreasing. But other 
indexes (measures of labor force participation, age, etc.) do not increase or decrease 
regularly with size-of-place. Thus the data do not fully endorse the concept of a 
rural-urban continuum. Also it is noteworthy that “...in terms of population 
characteristics, there is no uniform, sharp break between rural and urban on the size- 
of-place scale.” (p. 39)

Duncan and Reiss did not expect to identify the selective influence of communities 
exclusively in terms of size-of-place. Spatial position, growth, and functional special- 
izations are used as classificatory factors in order to supplement the rural-urban con- 
tinuum thesis and the conventional rural-urban dichotomy. 

The major positive finding resulting from the analysis of spatial organization is 
that the populations of suburbs are unique. For example, median income and median 
years of schooling completed are found to be higher in suburbs than in central cities, 
whereas opposite results are found in comparing small and large cities. Also, median 
income and median years of schooling completed are found to be higher in suburbs 
than in independent cities of the same size. These and similar data are used as factual 
bases by Duncan and Reiss for concluding that the selective influence of suburbs 
is unique. But this inference is not entirely warranted. 

It is regrettable that the authors fail to follow up their suggestion to separate
residential from industrial suburbs. Doubtless the relationships observed for all 
suburbs are valid (and more striking) for residential suburbs. These communities 
offer better dormitory facilities than do either central cities or independent cities of 
the same size. Better living conditions attract higher income groups, and therefore,
since high income groups contain disproportionately large numbers of highly educated 
persons, etc., residential suburbs also tend to select highly educated persons, etc. On 
the other hand, it is doubtful if the relationships in question hold for industrial sub- 
urbs. Opportunities for earning income in these communities probably are less than 
in central cities and not significantly greater than in independent cities of the same 
size. Also, living conditions both in central cities and in independent cities of the same 
size probably are at least equal to those in industrial suburbs. 

The obvious differences, especially in living quarters, between residential and in- 
dustrial suburbs cast doubt on the argument that suburbs, per se, have a unique 
selective influence. A limited proposition might be substituted: that residential sub- 
urbs select high income and associated groups because they offer relatively attractive 
dormitory facilities. 

The study of communities of different population growth attempts to explore 
“... the relationship of economic opportunity to the growth rate of cities.” (p. 210)
The results are inconclusive. This is a blessing in disguise because the analysis er-
roneously assumes that people live and work in the same political entity. Needless
to say, many persons commute between place of residence and place of employment.
Commuters often are persons who changed their place of residence when economic 
opportunities improved in the city of departure and thereby provided the means for 
acquiring better living quarters in the place of destination. Thus it is possible to 
observe little or negative population growth in cities in which economic opportunities 
are relatively great and improving, and conversely, high growth in cities in which 
opportunities for earning income are relatively small and tend to remain constant. 
The authors do not take this factor into account, and therefore their analysis of the 
relationship of economic opportunity to the rate of population growth of cities cannot 
be taken seriously. 

The final part of the monograph examines the relationship of population charac- 
teristics to functional specializations of cities. The attempt to identify the selective 
influence of communities in terms of functional specializations is perhaps the most 
important contribution of the study. But it must be pointed out that the results which 
Duncan and Reiss obtain (and they are mixed) cannot be regarded as ironclad. Only 
tentative conclusions can be reached because our economy’s production and consump- 
tion patterns are continually changing, and consequently some economic fields are 
growing and others declining in ability to bid for labor services. Thus communities 
specializing in productive activities that enable them to bid for certain population 
groups today might be outbid and lose these groups to other cities tomorrow, and 
conversely. It follows that relationships between differences in productive activities 
and other morphological differences in communities that are observed by Duncan 
and Reiss might be reversed in the future. 


Modern Market Research: A Guide for Business Executives. Max K. Adler. New York: 
Philosophical Library, Inc., 1957. Pp. ix, 158. $4.75. Printed in Great Britain for Philo- 
sophical Library by the Pitman Press, Bath. 


Leonard Kent, Needham, Louis and Brorby, Inc. 


The seventeen chapters of this small volume are devoted primarily to topics which, 
in this country, have become identified with the field of market research and its 
practitioners. A brief introductory statement of the nature and growth of market re- 
search serves to introduce such discussions as: applications to problems both internal 
and external to the business firm including sales analysis and forecasting, product 
research and consumer surveys; readership and copy testing aspects of advertising; 
principles of sampling and sample design; questionnaire construction; interviewing; 
data processing; the writing of reports; the organization and structure of the private 
or commercial research organization; and, finally, the role of research in an eco- 
nomic and social system in which the goals appear to be shifting from efficient pro- 
duction to efficient consumption. The only departure from the orthodox subject mat- 
ter of market research, and the only feature likely to strike U. S. market researchers 
as “modern,” is the single chapter dealing with motivation research. Incidentally, 
this chapter contains the volume’s only citation of references. 

The author early identifies the specific audience for which he is writing as the 
people “who commission research.” They are business executives primarily engaged 
in manufacturing, wholesaling, or the service trades. For these businessmen, the 
book is intended as a guide for communicating on equal terms with the research 
expert and for evaluating research findings independently. Among the goals he sets for 
himself are answers to such questions as: What can the business executive expect in 
return for his expenditures on research? How much should he be willing to pay for it? 
Whom should he entrust with its execution? 

Business executives are, undisputedly, an important audience to which to address 
a clear, concise, and accurate statement of what market research is and what it can 
do. The questions asked are all vital considerations if research is to be used properly 
as a tool of management in narrowing the field of uncertainty in decision making. This 
reviewer subscribes wholeheartedly to the alliance between market research and the 
social sciences. Both fields employ the scientific method for the purpose of determining 
human behavior; and, certainly, statistical techniques and procedures are the most 
powerful instruments available to each. The point is also well taken, but nowhere 
else elaborated, that market research alone cannot solve a firm’s business problems; 
and further, that it in no way delimits the responsibility of the decision maker. 
Unfortunately, after such a commendable start, and in light of the important audience 
he hopes to reach, the author’s treatment of the most basic procedures and techniques 
of market research contains a number of misleading, inaccurate, and downright er- 
roneous statements. A few of his conclusions can only be characterized as sheer non- 
sense. Neither the brevity of the volume nor the attempt to avoid technical language 
is a justifiable excuse for not checking the accuracy of facts and the precision of 
statements. 

This reviewer will cite only a few of this volume’s more glaring and conspicuous 
errors. To begin with, U. S. market researchers will certainly take up the cudgel 
against such a claim as “Britain is the mother country of market research.” This 
statement was recently refuted by some of the author’s English colleagues in their 
publication, Readings in Market Research, A Selection of Papers by British Authors. 
(See review in this Journal, June, 1957, pp. 273-6.) America is generally conceded to 
be the home of “scientific marketing” and the adherents to “the advancement of 
science in marketing” need only point to the founding of their professional society, 
the American Marketing Association, in 1937. The British Market Research Society, 
on the other hand, came into existence during the postwar year 1947. 

Here are a few examples of statements in the category of either nonsense or naivete: 
crediting the Literary Digest fiasco in the 1936 presidential elections in the U.S.A. 
with being “the starting point for modern sampling methods”; “approximately two 
thousand interviews are normal in a survey which concerns the entire population of 
the country,” regardless of the purpose or objectives of the investigation; research 
in the U. S. has conclusively shown that the highest returns are received from mail 
questionnaires that are reproduced on a light yellow paper; the implication that the 
Daniel Starch organization has made an exact science out of copy writing. More 
significantly, the conclusion that Starch “has so perfected his methods that there is 
probably not one advertising agency in the U.S.A. or in Britain which does not rely 
on the Starch Index” will be greeted as perhaps the most explosive misstatement in 
the book. 

The four chapters devoted to sampling are filled with misconceptions and 
inaccuracies. This reviewer would like to recommend to any reader of the volume that he 
substitute for the entire discussion on sampling the little volume entitled Statistics 
by L. H. C. Tippett, first published by the Oxford University Press in 1943. The 
business executive who does not wish to become an expert at statistics will obtain, 
if he reads Tippett’s discussion of sampling, a clear, concise, and accurate exposition 
of the role of statistics in marketing research. The sources of confusion in Adler’s 
discussion of sampling are too numerous to cite in a short review. Suffice it to mention 
only a few. The presentation of the Law of Large Numbers is confused with the be- 
havior of random errors; there is no understanding that this law actually works 
through “swamping” rather than “compensatory” effects. Nowhere is it made clear 
that sampling results vary by chance and that the pattern of chance variation depends 
on the population. There is the naive conception that a sample must be a miniature 
replica of a population. In fact, it is stated that “market research is not valid unless 
the sample represents a true cross section of the population to be investigated.” This 
statement presupposes, then, that enough is known about a population to make 
sampling unnecessary. It is impossible to guarantee that any sampling method, ran- 
dom or other, will produce a “representative” sample or a “true cross section.” If 
probability methods are used, any kind of a sample is possible. There are precautions, 
such as stratification, which can be taken to reduce the probability of bad samples. 
It is impossible to ensure the selection of a sample that will be “representative” of a 
population with regard to characteristics not known in advance of sampling. One of 
the most confused expositions of the entire treatment of sampling is the relationship 
between size of sample and the parent population and what effect this in turn has 
upon the reliability of the sample. In some instances, the author writes as if “the 
margin of error” is dependent upon the proportion the sample is of the total population. 
Although opposite to what common sense usually dictates, one of the concepts 
basic to a correct understanding of sampling variability and of elementary statistical 
inference is that the size of the population usually has little to do with standard 
error and, hence, with reliability of a sample. The reduction in standard error due to 
sampling from a finite population is negligible unless the sample contains a large 
proportion of the total population, say as much as twenty per cent. 
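
The point about population size can be checked with a little arithmetic. What follows is only a rough sketch, assuming simple random sampling without replacement and a population proportion of one half; the function name standard_error and the population sizes chosen are this illustration's own and come from neither Adler nor Tippett.

```python
import math

def standard_error(n, N=None, p=0.5):
    """Standard error of a sample proportion under simple random sampling.

    n: sample size
    N: population size (None treats the population as effectively infinite)
    p: assumed population proportion (0.5 gives the largest standard error)
    """
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        # Finite population correction for sampling without replacement.
        se *= math.sqrt((N - n) / (N - 1))
    return se

# The same sample of 2,000 drawn from populations of very different sizes.
for N in (10_000, 100_000, 50_000_000):
    print(N, round(standard_error(2000, N), 5))
print("infinite", round(standard_error(2000), 5))
```

When the sample is twenty per cent of the population (2,000 out of 10,000), the correction factor is about the square root of 0.8, so the standard error falls by roughly one tenth. When the sample is two per cent of the population, the reduction is about one per cent, and for a national population it is negligible.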

To conclude, this treatment of modern market research cannot be recommended. 
A footnote (p. 46) states that it has been translated into several languages, which is 
indeed unfortunate. According to the author’s own dictum, “bad research is worse 
than no research.” This reviewer would like to add that misleading and inaccurate 
statements can be even more damaging to a real understanding of the function of 
research in solving marketing problems than if reliance is placed solely on common 
sense and intuitive hunches. 